Git development
 help / color / mirror / Atom feed
* Re: [PATCH v1 0/3] War on dashed-git
From: Johannes Schindelin @ 2020-08-26  8:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <20200826011718.3186597-1-gitster@pobox.com>

Hi Junio,

On Tue, 25 Aug 2020, Junio C Hamano wrote:

> If we were to propose breaking the 12-year old promise to our users
> that we will keep "git-foo" binaries on disk where "git --exec-path"
> tells them, we first need to gauge how big a population we would be
> hurting with such a move.
>
> So here is a miniseries towards that.  The third one hooks to the
> codepath in git.c::cmd_main() where av[0] is of "git-foo" form and
> we dispatch to "foo" as a builtin command.  In the original code, we
> will die() if "foo" is not a built-in command in this codepath, so
> it is exactly the place we want to catch remaining uses of "git-foo"
> invoking built-in commands.
>
> There are a few legitimate "git-foo" calls made even for built-ins
> and those exceptions are marked in the command-list mechanism, which
> is shared with the help subsystem.  We might want to see if we can
> unify this exception list with what we have in the Makefile as
> BIN_PROGRAMS and what Dscho introduces as ALL_COMMANDS_TO_INSTALL
> in his series.  These have large overlaps in what they mean, but
> they are not exactly identical.

As it happens, I discussed a scenario the other day where dashed
invocations might still happen, and the consensus was that nobody looks at
the output unless things are broken.

So maybe this patch series would be a good first step, but if we truly
wanted to break that 12-year old promise, we might need to have another
patch on top that _does_ install the dashed commands, but into a
subdirectory of `libexec/git-core/` that is only added to the `PATH` when
an "escape hatch"-style config setting is set.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH v1 3/3] git: catch an attempt to run "git-foo"
From: Johannes Schindelin @ 2020-08-26  8:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <20200826011718.3186597-4-gitster@pobox.com>

Hi Junio,

On Tue, 25 Aug 2020, Junio C Hamano wrote:

> If we were to propose removing "git-foo" binaries from the
> filesystem for built-in commands, we should first see if there are
> users who will be affected by such a move.  When cmd_main() detects
> we were called not as "git", but as "git-foo", give an error message
> to ask the user to let us know and stop our removal plan, unless we
> are running a selected few programs that MUST be callable in the
> dashed form (e.g. "git-upload-pack").
>
> Those who are always using "git foo" form will not be affected, but
> those who trusted the promise we made to them 12 years ago that by
> prepending the output of $(git --exec-path) to the list of
> directories on their $PATH, they can safely keep writing
> "git-cat-file" will be negatively affected as all their scripts
> assuming the promise will be kept are now broken.

It might be a good idea to also instrument the existing scripts, to show
the same warning unless they were called through the `git` binary.

_If_ we were to do this ;-)

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH v1 2/3] cvsexportcommit: do not run git programs in dashed form
From: Johannes Schindelin @ 2020-08-26  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <20200826011718.3186597-3-gitster@pobox.com>

Hi Junio,

On Tue, 25 Aug 2020, Junio C Hamano wrote:

> This ancient script runs "git-foo" all over the place.  A strange
> thing is that it has t9200 tests successfully running, even though
> it does not seem to futz with PATH to prepend $(git --exec-path)
> output.  It is tempting to declare that the command must be unused,
> but that is left to another topic.

Not surprising at all: when t9200 runs `git cvsexportcommit`, it actually
runs `bin-wrappers/git`, which sets `GIT_EXEC_PATH` to the top-level
directory of the Git source code, then calls the `git` executable which in
turn will set up the `PATH` to prepend `GIT_EXEC_PATH`, and then look for
`git-cvsexportcommit` (which does not exist in `bin-wrappers/`, but in
`GIT_EXEC_PATH`). And of course then `git-rev-parse` is found on the
`PATH`, too.

Slightly more surprising is that my PR build did not fail. I guess we do
test `git svn`, but we skip all CVS-related tests, eh?

Slightly related: it might be a good time to deprecate the CVS-related Git
commands...

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] git-gui: accommodate for intent-to-add files
From: Pratyush Yadav @ 2020-08-26 14:52 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Johannes Schindelin via GitGitGadget, git
In-Reply-To: <nycvar.QRO.7.76.6.2008260936010.56@tvgsbejvaqbjf.bet>

On 26/08/20 09:36AM, Johannes Schindelin wrote:
> Hi Pratyush,
> 
> On Wed, 26 Aug 2020, Pratyush Yadav wrote:
> 
> > On 12/08/20 03:06PM, Johannes Schindelin via GitGitGadget wrote:
> > > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> > >
> > > As of Git v2.28.0, the diff for files staged via `git add -N` marks them
> > > as new files. Git GUI was ill-prepared for that, and this patch teaches
> > > Git GUI about them.
> > >
> > > Please note that this will not even fix things with v2.28.0, as the
> > > `rp/apply-cached-with-i-t-a` patches are required on Git's side, too.
> > >
> > > This fixes https://github.com/git-for-windows/git/issues/2779
> > >
> > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > > ---
> > >     git-gui: accommodate for intent-to-add files
> > >
> > >     This fixes the intent-to-add bug reported in
> > >     https://github.com/git-for-windows/git/issues/2779: after a file was
> > >     staged with git add -N, staging hunks/lines would fail silently.
> > >
> > >     On its own, this patch is not enough, as it requires the patches
> > >     provided in rp/apply-cached-with-i-t-a to be applied on Git's side.
> > >
> > >     Please note that this patch might need a bit more help, as I do not
> > >     really know whether showing "new file mode 100644" in the diff view is
> > >     desirable, or whether we should somehow try to retain the
> > >     "intent-to-add" state so that unstaging all hunks would return the file
> > >     to "intent-to-add" state.
> >
> > I built latest Git master (e9b77c84a0) which has
> > `rp/apply-cached-with-i-t-a` and tested this patch. It works... for the
> > most part.
> >
> > I can select a line set of lines and they get staged/unstaged, which is
> > good. The part that is not good though is that a lot of common
> > operations still don't work as they should:
> >
> > - I can't click on the icon in the "Unstaged Changes" pane to stage the
> >   whole file. Nothing happens when I do that.
> >
> > - The file that is marked as intent-to-add shows up in both the "Staged
> >   Changes" and "Unstaged Changes" panes, with the "Staged Changes" part
> >   being empty. Ideally it should only show up in the "Unstaged Changes"
> >   pane.
> >
> > - Selecting the whole file and choosing "Stage Lines for Commit" works
> >   well. But choosing "Stage Hunk for Commit" does not. While the changes
> >   do get staged, the UI is not properly updated and the file is still
> >   listed in the "Unstaged Changes" pane.
> >
> >   I think the difference here is because for
> >   `apply_or_revert_range_or_line`, we call `do_rescan` after it to
> >   update the UI, but for `apply_or_revert_hunk` we update the UI
> >   "manually" in the function after we are done applying or reverting the
> >   changes. So the logic to update the UI needs to be updated to account
> >   for this change. Or we can get rid of all that logic and just run a
> >   rescan.
> >
> > And also, like you mentioned, we don't retain the i-t-a state when
> > unstaging. But with some quick testing, I see that Git command line
> > doesn't either (I tried a plain `git restore --staged`). So IMO we
> > should mimic what the command line UI does and not retain the i-t-a
> > state when unstaging.
> 
> To be quite honest, I had hoped that this might be a good patch to start
> from... for somebody else (you?)

I'll take a stab at this during the weekend :-)

-- 
Regards,
Pratyush Yadav

^ permalink raw reply

* Re: [PATCH v1 1/3] transport-helper: do not run git-remote-ext etc. in dashed form
From: Johannes Schindelin @ 2020-08-26  7:55 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, Git List, Jeff King
In-Reply-To: <CAPig+cTvLaOD1idfB2M0-QSfXXKBe5-FnWSU9E0PaUMHAoGj1w@mail.gmail.com>

Hi,

On Tue, 25 Aug 2020, Eric Sunshine wrote:

> On Tue, Aug 25, 2020 at 9:17 PM Junio C Hamano <gitster@pobox.com> wrote:
> > Runing them as "git remote-ext" and letting "git" dispatch to
>
> s/Runing/Running/
> s/them/it/
>
> > "remote-ext" would just be fine and is more idiomatic.
> >
> > Signed-off-by: Junio C Hamano <gitster@pobox.com>
> > ---
> > diff --git a/transport-helper.c b/transport-helper.c
> > @@ -128,7 +128,8 @@ static struct child_process *get_helper(struct transport *transport)
> > -       strvec_pushf(&helper->args, "git-remote-%s", data->name);
> > +       strvec_push(&helper->args, "git");
> > +       strvec_pushf(&helper->args, "remote-%s", data->name);
>
> Rather than pushing "git" as the first argument, would it instead be
> more idiomatic to set `helper->git_cmd = 1` (or would that not work
> correctly for some reason)?

That is precisely what I did in Git for Windows:

-- snipsnap --
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: [PATCH] transport-helper: prefer Git's builtins over dashed form

This helps with minimal installations such as MinGit that refuse to
waste .zip real estate by shipping identical copies of builtins (.zip
files do not support hard links).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 transport-helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 35d6437b557..2c4b7a023db 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -130,10 +130,10 @@ static struct child_process *get_helper(struct transport *transport)
 	helper->in = -1;
 	helper->out = -1;
 	helper->err = 0;
-	argv_array_pushf(&helper->args, "git-remote-%s", data->name);
+	argv_array_pushf(&helper->args, "remote-%s", data->name);
 	argv_array_push(&helper->args, transport->remote->name);
 	argv_array_push(&helper->args, remove_ext_force(transport->url));
-	helper->git_cmd = 0;
+	helper->git_cmd = 1;
 	helper->silent_exec_failure = 1;

 	if (have_git_dir())
--
2.28.0.windows.1.18.g5300e52e185


^ permalink raw reply related

* [PATCH] pretty-options.txt: fix --no-abbrev-commit description
From: Sergey Organov @ 2020-08-26 14:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Sergey Organov

Description suggested --no-abbrev-commit negates --oneline as well as any other
option that implies --abbrev-commit. Fix it to say that it's --abbrev-commit
that is negated, not the option that implies it.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
---
 Documentation/pretty-options.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 7a6da6db780e..17c5aac4b71d 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -25,8 +25,8 @@ people using 80-column terminals.
 
 --no-abbrev-commit::
 	Show the full 40-byte hexadecimal commit object name. This negates
-	`--abbrev-commit` and those options which imply it such as
-	"--oneline". It also overrides the `log.abbrevCommit` variable.
+	`--abbrev-commit`, either explicit or implied by other options such
+	as "--oneline". It also overrides the `log.abbrevCommit` variable.
 
 --oneline::
 	This is a shorthand for "--pretty=oneline --abbrev-commit"
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] git-gui: accommodate for intent-to-add files
From: Johannes Schindelin @ 2020-08-26  7:36 UTC (permalink / raw)
  To: Pratyush Yadav; +Cc: Johannes Schindelin via GitGitGadget, git
In-Reply-To: <20200826113030.xnutfxxfmdhgoq5o@yadavpratyush.com>

Hi Pratyush,

On Wed, 26 Aug 2020, Pratyush Yadav wrote:

> On 12/08/20 03:06PM, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > As of Git v2.28.0, the diff for files staged via `git add -N` marks them
> > as new files. Git GUI was ill-prepared for that, and this patch teaches
> > Git GUI about them.
> >
> > Please note that this will not even fix things with v2.28.0, as the
> > `rp/apply-cached-with-i-t-a` patches are required on Git's side, too.
> >
> > This fixes https://github.com/git-for-windows/git/issues/2779
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >     git-gui: accommodate for intent-to-add files
> >
> >     This fixes the intent-to-add bug reported in
> >     https://github.com/git-for-windows/git/issues/2779: after a file was
> >     staged with git add -N, staging hunks/lines would fail silently.
> >
> >     On its own, this patch is not enough, as it requires the patches
> >     provided in rp/apply-cached-with-i-t-a to be applied on Git's side.
> >
> >     Please note that this patch might need a bit more help, as I do not
> >     really know whether showing "new file mode 100644" in the diff view is
> >     desirable, or whether we should somehow try to retain the
> >     "intent-to-add" state so that unstaging all hunks would return the file
> >     to "intent-to-add" state.
>
> I built latest Git master (e9b77c84a0) which has
> `rp/apply-cached-with-i-t-a` and tested this patch. It works... for the
> most part.
>
> I can select a line set of lines and they get staged/unstaged, which is
> good. The part that is not good though is that a lot of common
> operations still don't work as they should:
>
> - I can't click on the icon in the "Unstaged Changes" pane to stage the
>   whole file. Nothing happens when I do that.
>
> - The file that is marked as intent-to-add shows up in both the "Staged
>   Changes" and "Unstaged Changes" panes, with the "Staged Changes" part
>   being empty. Ideally it should only show up in the "Unstaged Changes"
>   pane.
>
> - Selecting the whole file and choosing "Stage Lines for Commit" works
>   well. But choosing "Stage Hunk for Commit" does not. While the changes
>   do get staged, the UI is not properly updated and the file is still
>   listed in the "Unstaged Changes" pane.
>
>   I think the difference here is because for
>   `apply_or_revert_range_or_line`, we call `do_rescan` after it to
>   update the UI, but for `apply_or_revert_hunk` we update the UI
>   "manually" in the function after we are done applying or reverting the
>   changes. So the logic to update the UI needs to be updated to account
>   for this change. Or we can get rid of all that logic and just run a
>   rescan.
>
> And also, like you mentioned, we don't retain the i-t-a state when
> unstaging. But with some quick testing, I see that Git command line
> doesn't either (I tried a plain `git restore --staged`). So IMO we
> should mimic what the command line UI does and not retain the i-t-a
> state when unstaging.

To be quite honest, I had hoped that this might be a good patch to start
from... for somebody else (you?)

:-)

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH v2 2/7] maintenance: store the "last run" time in config
From: Derrick Stolee @ 2020-08-26 13:34 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget
  Cc: git, sandals, steadmon, jrnieder, peff, congdanhqx,
	phillip.wood123, emilyshaffer, sluongng, jonathantanmy,
	Derrick Stolee, Derrick Stolee
In-Reply-To: <xmqqh7sqz93k.fsf@gitster.c.googlers.com>

On 8/25/2020 5:52 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> I selected the timestamp before the task, as opposed to after the task,
>> for a couple reasons:
>>
>>  1. The time the task takes to execute should not contribute to the
>>     interval between running the tasks.
> 
> ... as long as the run time is sufficiently shorter than the
> interval, that is.  If a task takes 10-30 minutes depending on how
> dirty the repository is, it does not make sense to even try to run
> it every 15 minutes.

Definitely. The lock on the object database from earlier prevents these
longer-than-anticipated tasks from stacking.

>> If a daily task takes 10 minutes
>>     to run, then every day the execution will drift by at least 10
>>     minutes.
> 
> That is not incorrect per-se, but it does not tell us why drifting
> by 10 minutes is a bad thing.

True.

>>  2. If the task fails for some unforseen reason, it would be good to
>>     indicate that we _attempted_ the task at a certain timestamp. This
>>     will avoid spamming a repository that is in a bad state.
> 
> Absolutely.
> 
>> +static void update_last_run(struct maintenance_task *task)
>> +{
>> +	timestamp_t now = approxidate("now");
>> +	struct strbuf config = STRBUF_INIT;
>> +	struct strbuf value = STRBUF_INIT;
>> +	strbuf_addf(&config, "maintenance.%s.lastrun", task->name);
>> +	strbuf_addf(&value, "%"PRItime"", now);
> 
> So is this essentially meant as a human-unreadable opaque value,
> like we have in the commit object header lines?  I do not have a
> strong opinion, but it would be nice to allow curious to casually
> read it.  Perhaps "git config --type=timestamp maintenance.lastrun"
> can be taught to pretty print its value?

Good idea. I will think on this. Of course, we already have
--type=expiry-date, which does the opposite. Perhaps this config
value should be a human-readable date and then be parsed into a
timestamp in-process using git_config_expiry_date()?

I have mixed feelings on using that format, because it can store
both a fixed or relative datetime. The *.lastRun config really
wants a _fixed_ datetime, but the *.schedule config (in the next
patch) would want a _relative_ datetime. This also allows things
like "now" or "never", so it presents a lot of flexibility for
users. A nightmare to test, but perhaps that flexibility is
useful.

(Of course, in another thread you mentioned multiple `crontab`
lines, which might make this entire discussion irrelevant. I'll
follow up there.)

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH 0/7] [RFC] Maintenance III: background maintenance
From: Michal Suchánek @ 2020-08-26 12:42 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, sandals, steadmon, jrnieder, peff, congdanhqx,
	phillip.wood123, emilyshaffer, sluongng, jonathantanmy,
	Derrick Stolee
In-Reply-To: <pull.680.git.1597857408.gitgitgadget@gmail.com>

On Wed, Aug 19, 2020 at 05:16:41PM +0000, Derrick Stolee via GitGitGadget wrote:
> This is based on ds/maintenance-part-2, but with some local updates to
> review feedback. It won't apply cleanly right now. This RFC is for early
> feedback and not intended to make a new tracking branch until v2.
> 
> This RFC is intended to show how I hope to integrate true background
> maintenance into Git. As opposed to my original RFC [1], this entirely
> integrates with cron (through crontab [-e|-l]) to launch maintenance
> commands in the background.
> 
> [1] 
> https://lore.kernel.org/git/pull.597.git.1585946894.gitgitgadget@gmail.com/
> 
> Some preliminary work is done to allow a new --scheduled option that
> triggers enabled tasks only if they have not been run in some amount of
> time. The timestamp of the previous run is stored in the 
> maintenance.<task>.lastRun config value while the interval is stored in the 
> maintenance.<task>.schedule config value.
This changes the config file from read-mostly to continuously updated. Is
that desirable?
In particular it significanly increases the risk of race with the user
editing the file.
I think timestamps are not configuration and should be written to some
other file.
Or is there already a core git feature that continuously updates the
config file?

Thanks

Michal

^ permalink raw reply

* Re: [PATCH v2 1/7] maintenance: optionally skip --auto process
From: Derrick Stolee @ 2020-08-26 12:29 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget
  Cc: git, sandals, steadmon, jrnieder, peff, congdanhqx,
	phillip.wood123, emilyshaffer, sluongng, jonathantanmy,
	Derrick Stolee, Derrick Stolee
In-Reply-To: <xmqqlfi2z9ht.fsf@gitster.c.googlers.com>

On 8/25/2020 5:44 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Some commands run 'git maintenance run --auto --[no-]quiet' after doing
>> their normal work, as a way to keep repositories clean as they are used.
>> Currently, users who do not want this maintenance to occur would set the
>> 'gc.auto' config option to 0 to avoid the 'gc' task from running.
>> However, this does not stop the extra process invocation.
> 
> OK, that is because the configuration is checked on the other side,
> and the new check is implemented on this side before we decide to
> spawn the maintenance task.
> 
> It sounds like a change worth having without even waiting for the
> "git maintenance" to materialize ;-).

True. This one could be pulled out of Part III and placed any time
after Part I (outside of test script context conflicts). The only
reason to wait until after Part I is that the name "maintenance.auto"
implies a connection to the maintenance builtin instead of gc.

Before the maintenance builtin, it would be natural to see "gc.auto=0"
and do this same behavior, but that will not be general enough in the
future.

If you prefer, I can pull this out into a series on its own to be
tracked separately.

Thanks,
-Stolee


^ permalink raw reply

* [PATCH v3 2/3] Optionally skip linking/copying the built-ins
From: Johannes Schindelin via GitGitGadget @ 2020-08-26 11:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin
In-Reply-To: <pull.411.v3.git.1598443012.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

For a long time already, the non-dashed form of the built-ins is the
recommended way to write scripts, i.e. it is better to call `git merge
[...]` than to call `git-merge [...]`.

While Git still supports the dashed form (by hard-linking the `git`
executable to the dashed name in `libexec/git-core/`), in practice, it
is probably almost irrelevant.

However, we *do* care about keeping people's scripts working (even if
they were written before the non-dashed form started to be recommended).

Keeping this backwards-compatibility is not necessarily cheap, though:
even so much as amending the tip commit in a git.git checkout will
require re-linking all of those dashed commands. On this developer's
laptop, this makes a noticeable difference:

	$ touch version.c && time make
	    CC version.o
	    AR libgit.a
	    LINK git-bugreport.exe
	    [... 11 similar lines ...]
	    LN/CP git-remote-https.exe
	    LN/CP git-remote-ftp.exe
	    LN/CP git-remote-ftps.exe
	    LINK git.exe
	    BUILTIN git-add.exe
	    [... 123 similar lines ...]
	    BUILTIN all
	    SUBDIR git-gui
	    SUBDIR gitk-git
	    SUBDIR templates
	    LINK t/helper/test-fake-ssh.exe
	    LINK t/helper/test-line-buffer.exe
	    LINK t/helper/test-svn-fe.exe
	    LINK t/helper/test-tool.exe

	real    0m36.633s
	user    0m3.794s
	sys     0m14.141s

	$ touch version.c && time make SKIP_DASHED_BUILT_INS=1
	    CC version.o
	    AR libgit.a
	    LINK git-bugreport.exe
	    [... 11 similar lines ...]
	    LN/CP git-remote-https.exe
	    LN/CP git-remote-ftp.exe
	    LN/CP git-remote-ftps.exe
	    LINK git.exe
	    BUILTIN git-receive-pack.exe
	    BUILTIN git-upload-archive.exe
	    BUILTIN git-upload-pack.exe
	    BUILTIN all
	    SUBDIR git-gui
	    SUBDIR gitk-git
	    SUBDIR templates
	    LINK t/helper/test-fake-ssh.exe
	    LINK t/helper/test-line-buffer.exe
	    LINK t/helper/test-svn-fe.exe
	    LINK t/helper/test-tool.exe

	real    0m23.717s
	user    0m1.562s
	sys     0m5.210s

Also, `.zip` files do not have any standardized support for hard-links,
therefore "zipping up" the executables will result in inflated disk
usage. (To keep down the size of the "MinGit" variant of Git for
Windows, which is distributed as a `.zip` file, the hard-links are
excluded specifically.)

In addition to that, some programs that are regularly used to assess
disk usage fail to realize that those are hard-links, and heavily
overcount disk usage. Most notably, this was the case with Windows
Explorer up until the last couple of Windows 10 versions. See e.g.
https://github.com/msysgit/msysgit/issues/58.

To save on the time needed to hard-link these dashed commands, with the
plan to eventually stop shipping with those hard-links on Windows, let's
introduce a Makefile knob to skip generating them.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile | 55 +++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 18 deletions(-)

diff --git a/Makefile b/Makefile
index 66b6e076e2..0a09146fb3 100644
--- a/Makefile
+++ b/Makefile
@@ -348,6 +348,9 @@ all::
 # Define NO_INSTALL_HARDLINKS if you prefer to use either symbolic links or
 # copies to install built-in git commands e.g. git-cat-file.
 #
+# Define SKIP_DASHED_BUILT_INS if you do not need the dashed versions of the
+# built-ins to be linked/copied at all.
+#
 # Define USE_NED_ALLOCATOR if you want to replace the platforms default
 # memory allocators with the nedmalloc allocator written by Niall Douglas.
 #
@@ -775,6 +778,16 @@ BUILT_INS += git-whatchanged$X
 # what 'all' will build and 'install' will install in gitexecdir,
 # excluding programs for built-in commands
 ALL_PROGRAMS = $(PROGRAMS) $(SCRIPTS)
+ALL_COMMANDS_TO_INSTALL = $(ALL_PROGRAMS)
+ifeq (,$(SKIP_DASHED_BUILT_INS))
+ALL_COMMANDS_TO_INSTALL += $(BUILT_INS)
+else
+# git-upload-pack, git-receive-pack and git-upload-archive are special: they
+# are _expected_ to be present in the `bin/` directory in their dashed form.
+ALL_COMMANDS_TO_INSTALL += git-receive-pack$(X)
+ALL_COMMANDS_TO_INSTALL += git-upload-archive$(X)
+ALL_COMMANDS_TO_INSTALL += git-upload-pack$(X)
+endif
 
 # what 'all' will build but not install in gitexecdir
 OTHER_PROGRAMS = git$X
@@ -2066,9 +2079,9 @@ profile-fast: profile-clean
 	$(MAKE) PROFILE=USE all
 
 
-all:: $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS) GIT-BUILD-OPTIONS
+all:: $(ALL_COMMANDS_TO_INSTALL) $(SCRIPT_LIB) $(OTHER_PROGRAMS) GIT-BUILD-OPTIONS
 ifneq (,$X)
-	$(QUIET_BUILT_IN)$(foreach p,$(patsubst %$X,%,$(filter %$X,$(ALL_PROGRAMS) $(BUILT_INS) git$X)), test -d '$p' -o '$p' -ef '$p$X' || $(RM) '$p';)
+	$(QUIET_BUILT_IN)$(foreach p,$(patsubst %$X,%,$(filter %$X,$(ALL_COMMANDS_TO_INSTALL) git$X)), test -d '$p' -o '$p' -ef '$p$X' || $(RM) '$p';)
 endif
 
 all::
@@ -2928,7 +2941,7 @@ ifndef NO_TCLTK
 	$(MAKE) -C git-gui gitexecdir='$(gitexec_instdir_SQ)' install
 endif
 ifneq (,$X)
-	$(foreach p,$(patsubst %$X,%,$(filter %$X,$(ALL_PROGRAMS) $(BUILT_INS) git$X)), test '$(DESTDIR_SQ)$(gitexec_instdir_SQ)/$p' -ef '$(DESTDIR_SQ)$(gitexec_instdir_SQ)/$p$X' || $(RM) '$(DESTDIR_SQ)$(gitexec_instdir_SQ)/$p';)
+	$(foreach p,$(patsubst %$X,%,$(filter %$X,$(ALL_COMMANDS_TO_INSTALL) git$X)), test '$(DESTDIR_SQ)$(gitexec_instdir_SQ)/$p' -ef '$(DESTDIR_SQ)$(gitexec_instdir_SQ)/$p$X' || $(RM) '$(DESTDIR_SQ)$(gitexec_instdir_SQ)/$p';)
 endif
 
 	bindir=$$(cd '$(DESTDIR_SQ)$(bindir_SQ)' && pwd) && \
@@ -2946,21 +2959,27 @@ endif
 	} && \
 	for p in $(filter $(install_bindir_programs),$(BUILT_INS)); do \
 		$(RM) "$$bindir/$$p" && \
-		test -n "$(INSTALL_SYMLINKS)" && \
-		ln -s "git$X" "$$bindir/$$p" || \
-		{ test -z "$(NO_INSTALL_HARDLINKS)" && \
-		  ln "$$bindir/git$X" "$$bindir/$$p" 2>/dev/null || \
-		  ln -s "git$X" "$$bindir/$$p" 2>/dev/null || \
-		  cp "$$bindir/git$X" "$$bindir/$$p" || exit; } \
+		if test -z "$(SKIP_DASHED_BUILT_INS)"; \
+		then \
+			test -n "$(INSTALL_SYMLINKS)" && \
+			ln -s "git$X" "$$bindir/$$p" || \
+			{ test -z "$(NO_INSTALL_HARDLINKS)" && \
+			  ln "$$bindir/git$X" "$$bindir/$$p" 2>/dev/null || \
+			  ln -s "git$X" "$$bindir/$$p" 2>/dev/null || \
+			  cp "$$bindir/git$X" "$$bindir/$$p" || exit; }; \
+		fi \
 	done && \
 	for p in $(BUILT_INS); do \
 		$(RM) "$$execdir/$$p" && \
-		test -n "$(INSTALL_SYMLINKS)" && \
-		ln -s "$$destdir_from_execdir_SQ/$(bindir_relative_SQ)/git$X" "$$execdir/$$p" || \
-		{ test -z "$(NO_INSTALL_HARDLINKS)" && \
-		  ln "$$execdir/git$X" "$$execdir/$$p" 2>/dev/null || \
-		  ln -s "git$X" "$$execdir/$$p" 2>/dev/null || \
-		  cp "$$execdir/git$X" "$$execdir/$$p" || exit; } \
+		if test -z "$(SKIP_DASHED_BUILT_INS)"; \
+		then \
+			test -n "$(INSTALL_SYMLINKS)" && \
+			ln -s "$$destdir_from_execdir_SQ/$(bindir_relative_SQ)/git$X" "$$execdir/$$p" || \
+			{ test -z "$(NO_INSTALL_HARDLINKS)" && \
+			  ln "$$execdir/git$X" "$$execdir/$$p" 2>/dev/null || \
+			  ln -s "git$X" "$$execdir/$$p" 2>/dev/null || \
+			  cp "$$execdir/git$X" "$$execdir/$$p" || exit; }; \
+		fi \
 	done && \
 	remote_curl_aliases="$(REMOTE_CURL_ALIASES)" && \
 	for p in $$remote_curl_aliases; do \
@@ -3051,7 +3070,7 @@ ifneq ($(INCLUDE_DLLS_IN_ARTIFACTS),)
 OTHER_PROGRAMS += $(shell echo *.dll t/helper/*.dll)
 endif
 
-artifacts-tar:: $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS) \
+artifacts-tar:: $(ALL_COMMANDS_TO_INSTALL) $(SCRIPT_LIB) $(OTHER_PROGRAMS) \
 		GIT-BUILD-OPTIONS $(TEST_PROGRAMS) $(test_bindir_programs) \
 		$(MOFILES)
 	$(QUIET_SUBDIR0)templates $(QUIET_SUBDIR1) \
@@ -3146,7 +3165,7 @@ endif
 
 ### Check documentation
 #
-ALL_COMMANDS = $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS)
+ALL_COMMANDS = $(ALL_COMMANDS_TO_INSTALL) $(SCRIPT_LIB)
 ALL_COMMANDS += git
 ALL_COMMANDS += git-citool
 ALL_COMMANDS += git-gui
@@ -3186,7 +3205,7 @@ check-docs::
 		    -e 's/\.txt//'; \
 	) | while read how cmd; \
 	do \
-		case " $(patsubst %$X,%,$(ALL_COMMANDS) $(EXCLUDED_PROGRAMS)) " in \
+		case " $(patsubst %$X,%,$(ALL_COMMANDS) $(BUILT_INS) $(EXCLUDED_PROGRAMS)) " in \
 		*" $$cmd "*)	;; \
 		*) echo "removed but $$how: $$cmd" ;; \
 		esac; \
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 3/3] ci: stop linking built-ins to the dashed versions
From: Johannes Schindelin via GitGitGadget @ 2020-08-26 11:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin
In-Reply-To: <pull.411.v3.git.1598443012.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Originally, all of Git's subcommands were implemented in their own
executable/script, using the naming scheme `git-<command-name>`. When
more and more functionality was turned into built-in commands (i.e. the
`git` executable could run them without spawning a separate process),
for backwards-compatibility, we hard-link the `git` executable to
`git-<built-in>` for every built-in.

This backwards-compatibility was needed to support scripts that called
the dashed form, even if we deprecated that a _long_ time ago.

For that reason, we just introduced a Makefile knob to skip linking
them. To make sure that this keeps working, teach the CI
(and PR) builds to skip generating those hard-links.

This is actually not such a big change: e4597aae6590 (run test suite
without dashed git-commands in PATH, 2009-12-02) made sure that our test
suite does not require dashed commands. With this Makefile knob, the
commitment is just a little stronger (running tests with `--with-dashes`
would _still_ not see the dashed form of the built-ins).

There is a subtle change in behavior with this patch, though: as we no
longer even _build_ the dashed executables, running the test suite would
fail if any of Git's scripted commands (e.g. `git-request-pull`) still
This would have succeeded previously (and would have been unintentional,
of course) because `bin-wrappers/git` sets `GIT_EXEC_PATH` to the
top-level directory (which would still have contained, say,
`git-rev-parse`).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 ci/run-build-and-tests.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 6c27b886b8..1df9402c3b 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -10,7 +10,7 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";;
 *) ln -s "$cache_dir/.prove" t/.prove;;
 esac
 
-make
+make SKIP_DASHED_BUILT_INS=YesPlease
 case "$jobname" in
 linux-gcc)
 	make test
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v3 1/3] msvc: copy the correct `.pdb` files in the Makefile target `install`
From: Johannes Schindelin via GitGitGadget @ 2020-08-26 11:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin
In-Reply-To: <pull.411.v3.git.1598443012.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

There is a hard-coded list of `.pdb` files to copy. But we are about to
introduce the `SKIP_DASHED_BUILT_INS` knob in the `Makefile`, which
might make this hard-coded list incorrect.

Let's switch to a dynamically-generated list instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile | 16 ++--------------
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/Makefile b/Makefile
index 65f8cfb236..66b6e076e2 100644
--- a/Makefile
+++ b/Makefile
@@ -2899,20 +2899,8 @@ ifdef MSVC
 	# have already been rolled up into the exe's pdb file.
 	# We DO NOT have pdb files for the builtin commands (like git-status.exe)
 	# because it is just a copy/hardlink of git.exe, rather than a unique binary.
-	$(INSTALL) git.pdb '$(DESTDIR_SQ)$(bindir_SQ)'
-	$(INSTALL) git-shell.pdb '$(DESTDIR_SQ)$(bindir_SQ)'
-	$(INSTALL) git-upload-pack.pdb '$(DESTDIR_SQ)$(bindir_SQ)'
-	$(INSTALL) git-credential-store.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-daemon.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-fast-import.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-http-backend.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-http-fetch.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-http-push.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-imap-send.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-remote-http.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-remote-testsvn.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-sh-i18n--envsubst.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
-	$(INSTALL) git-show-index.pdb '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
+	$(INSTALL) $(patsubst %.exe,%.pdb,$(filter-out $(BUILT_INS),$(patsubst %,%$X,$(BINDIR_PROGRAMS_NEED_X)))) '$(DESTDIR_SQ)$(bindir_SQ)'
+	$(INSTALL) $(patsubst %.exe,%.pdb,$(filter-out $(BUILT_INS) $(REMOTE_CURL_ALIASES),$(PROGRAMS))) '$(DESTDIR_SQ)$(gitexec_instdir_SQ)'
 ifndef DEBUG
 	$(INSTALL) $(vcpkg_rel_bin)/*.dll '$(DESTDIR_SQ)$(bindir_SQ)'
 	$(INSTALL) $(vcpkg_rel_bin)/*.pdb '$(DESTDIR_SQ)$(bindir_SQ)'
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH v3 0/3] Optionally skip linking/copying the built-ins
From: Johannes Schindelin via GitGitGadget @ 2020-08-26 11:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin
In-Reply-To: <pull.411.v2.git.1598283480.gitgitgadget@gmail.com>

The dashed form of the built-ins is so passé.

Incidentally, this also handles the .pdb issue in MSVC's install Makefile
target that Peff pointed out in the context of the "slimming down" patch
series
[https://lore.kernel.org/git/20200813145719.GA891370@coredump.intra.peff.net/]
.

This addresses https://github.com/gitgitgadget/git/issues/406

Changes since v2:

 * Reworded and clarified the commit messages of the second and third patch.

Changes since v1:

 * Fixed check-docs under SKIP_DASHED_BUILT_INS
 * Renamed ALL_PROGRAMS_AND_BUILT_INS to ALL_COMMANDS_TO_INSTALL to reflect
   its purpose better.
 * Revamped the commit message of patch 2/3 and 3/3.

Johannes Schindelin (3):
  msvc: copy the correct `.pdb` files in the Makefile target `install`
  Optionally skip linking/copying the built-ins
  ci: stop linking built-ins to the dashed versions

 Makefile                  | 71 +++++++++++++++++++++------------------
 ci/run-build-and-tests.sh |  2 +-
 2 files changed, 40 insertions(+), 33 deletions(-)


base-commit: 878e727637ec5815ccb3301eb994a54df95b21b8
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-411%2Fdscho%2Foptionally-skip-dashed-built-ins-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-411/dscho/optionally-skip-dashed-built-ins-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/411

Range-diff vs v2:

 1:  1880a0e4bf = 1:  1880a0e4bf msvc: copy the correct `.pdb` files in the Makefile target `install`
 2:  166bd0d8fb ! 2:  52deafded5 Optionally skip linking/copying the built-ins
     @@ Commit message
          executable to the dashed name in `libexec/git-core/`), in practice, it
          is probably almost irrelevant.
      
     -    In fact, some platforms (such as Windows) only started gaining
     -    meaningful Git support _after_ the dashed form was deprecated, and
     -    therefore one would expect that all this hard-linking is unnecessary on
     -    those platforms.
     +    However, we *do* care about keeping people's scripts working (even if
     +    they were written before the non-dashed form started to be recommended).
     +
     +    Keeping this backwards-compatibility is not necessarily cheap, though:
     +    even so much as amending the tip commit in a git.git checkout will
     +    require re-linking all of those dashed commands. On this developer's
     +    laptop, this makes a noticeable difference:
     +
     +            $ touch version.c && time make
     +                CC version.o
     +                AR libgit.a
     +                LINK git-bugreport.exe
     +                [... 11 similar lines ...]
     +                LN/CP git-remote-https.exe
     +                LN/CP git-remote-ftp.exe
     +                LN/CP git-remote-ftps.exe
     +                LINK git.exe
     +                BUILTIN git-add.exe
     +                [... 123 similar lines ...]
     +                BUILTIN all
     +                SUBDIR git-gui
     +                SUBDIR gitk-git
     +                SUBDIR templates
     +                LINK t/helper/test-fake-ssh.exe
     +                LINK t/helper/test-line-buffer.exe
     +                LINK t/helper/test-svn-fe.exe
     +                LINK t/helper/test-tool.exe
     +
     +            real    0m36.633s
     +            user    0m3.794s
     +            sys     0m14.141s
     +
     +            $ touch version.c && time make SKIP_DASHED_BUILT_INS=1
     +                CC version.o
     +                AR libgit.a
     +                LINK git-bugreport.exe
     +                [... 11 similar lines ...]
     +                LN/CP git-remote-https.exe
     +                LN/CP git-remote-ftp.exe
     +                LN/CP git-remote-ftps.exe
     +                LINK git.exe
     +                BUILTIN git-receive-pack.exe
     +                BUILTIN git-upload-archive.exe
     +                BUILTIN git-upload-pack.exe
     +                BUILTIN all
     +                SUBDIR git-gui
     +                SUBDIR gitk-git
     +                SUBDIR templates
     +                LINK t/helper/test-fake-ssh.exe
     +                LINK t/helper/test-line-buffer.exe
     +                LINK t/helper/test-svn-fe.exe
     +                LINK t/helper/test-tool.exe
     +
     +            real    0m23.717s
     +            user    0m1.562s
     +            sys     0m5.210s
     +
     +    Also, `.zip` files do not have any standardized support for hard-links,
     +    therefore "zipping up" the executables will result in inflated disk
     +    usage. (To keep down the size of the "MinGit" variant of Git for
     +    Windows, which is distributed as a `.zip` file, the hard-links are
     +    excluded specifically.)
      
          In addition to that, some programs that are regularly used to assess
          disk usage fail to realize that those are hard-links, and heavily
          overcount disk usage. Most notably, this was the case with Windows
     -    Explorer up until the last couple of Windows 10 versions.
     +    Explorer up until the last couple of Windows 10 versions. See e.g.
     +    https://github.com/msysgit/msysgit/issues/58.
      
     -    To save on the time needed to hard-link these dashed commands, and to
     -    eventually stop shipping with those hard-links on Windows, let's
     +    To save on the time needed to hard-link these dashed commands, with the
     +    plan to eventually stop shipping with those hard-links on Windows, let's
          introduce a Makefile knob to skip generating them.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
 3:  ea23ba5e26 ! 3:  99a5328492 ci: stop linking built-ins to the dashed versions
     @@ Commit message
          the dashed form, even if we deprecated that a _long_ time ago.
      
          For that reason, we just introduced a Makefile knob to skip linking
     -    them. TO make sure that this keeps working, teach the CI
     +    them. To make sure that this keeps working, teach the CI
          (and PR) builds to skip generating those hard-links.
      
     +    This is actually not such a big change: e4597aae6590 (run test suite
     +    without dashed git-commands in PATH, 2009-12-02) made sure that our test
     +    suite does not require dashed commands. With this Makefile knob, the
     +    commitment is just a little stronger (running tests with `--with-dashes`
     +    would _still_ not see the dashed form of the built-ins).
     +
     +    There is a subtle change in behavior with this patch, though: as we no
     +    longer even _build_ the dashed executables, running the test suite would
     +    fail if any of Git's scripted commands (e.g. `git-request-pull`) still
     +    This would have succeeded previously (and would have been unintentional,
     +    of course) because `bin-wrappers/git` sets `GIT_EXEC_PATH` to the
     +    top-level directory (which would still have contained, say,
     +    `git-rev-parse`).
     +
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## ci/run-build-and-tests.sh ##

-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH v2 3/3] ci: stop linking built-ins to the dashed versions
From: Johannes Schindelin @ 2020-08-26  4:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Johannes Schindelin via GitGitGadget, git
In-Reply-To: <xmqq364a3f6r.fsf@gitster.c.googlers.com>

[-- Attachment #1: Type: text/plain, Size: 2005 bytes --]

Hi,

On Tue, 25 Aug 2020, Junio C Hamano wrote:

> SZEDER Gábor <szeder.dev@gmail.com> writes:
>
> > I'm afraid I don't understand this patch or the previous one (or
> > both?).  So this new Makefile knob stops hard-linking the dashed
> > builtins _during 'make install'_, but it doesn't affect how Git is
> > built by the default target.  And our CI jobs only build Git by the
> > default target, but don't run 'make install', so setting
> > SKIP_DASHED_BUILT_INS wouldn't have any affect anyway.
>
> Very very true.  Let's drop 3/3 if it is not testing anything new.
>
> I do understand the concern 2/3 wants to address, and it would be a
> real one to you especially if you come from Windows.  People on the
> platform wouldn't be able to use shell scripts written in 12 years
> ago or written with the promise we made to our users 12 years ago,
> and unlike hardlink-capable platforms it incurs real cost to install
> these individual binaries on disk.

Actually, `SKIP_DASHED_BUILT_INS` does not _only_ have an impact on `make
install`:

	$ rm git-add.exe && make
	    BUILTIN git-add.exe
	    BUILTIN all
	    SUBDIR git-gui
	    SUBDIR gitk-git
	    SUBDIR templates

	$ rm git-add.exe && make SKIP_DASHED_BUILT_INS=1
	    BUILTIN all
	    SUBDIR git-gui
	    SUBDIR gitk-git
	    SUBDIR templates

See how `git-add.exe` is linked in the first, but not in the second run?

So the difference 3/3 has is that those hard-linked executables are not
even generated. Now, _technically_ this should not result in any change
because we run the test suite without `--with-dashes`.

Practically, it _does_ make a difference, though, as `bin-wrappers/git`
_specifically_ sets `GIT_EXEC_PATH` to the top-level directory, i.e.
`git-add.exe` _would_ be found if any core Git command that is still
implemented as a script called `git-add`.

Therefore, 3/3 makes sure that we really, really, really do not use those
dashed invocations ourselves.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] git-gui: accommodate for intent-to-add files
From: Pratyush Yadav @ 2020-08-26 11:30 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin
In-Reply-To: <pull.699.git.1597244777943.gitgitgadget@gmail.com>

Hi Johannes,

Thanks for the patch.

On 12/08/20 03:06PM, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> As of Git v2.28.0, the diff for files staged via `git add -N` marks them
> as new files. Git GUI was ill-prepared for that, and this patch teaches
> Git GUI about them.
> 
> Please note that this will not even fix things with v2.28.0, as the
> `rp/apply-cached-with-i-t-a` patches are required on Git's side, too.
> 
> This fixes https://github.com/git-for-windows/git/issues/2779
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     git-gui: accommodate for intent-to-add files
>     
>     This fixes the intent-to-add bug reported in 
>     https://github.com/git-for-windows/git/issues/2779: after a file was
>     staged with git add -N, staging hunks/lines would fail silently.
>     
>     On its own, this patch is not enough, as it requires the patches
>     provided in rp/apply-cached-with-i-t-a to be applied on Git's side.
>     
>     Please note that this patch might need a bit more help, as I do not
>     really know whether showing "new file mode 100644" in the diff view is
>     desirable, or whether we should somehow try to retain the
>     "intent-to-add" state so that unstaging all hunks would return the file
>     to "intent-to-add" state.

I built latest Git master (e9b77c84a0) which has 
`rp/apply-cached-with-i-t-a` and tested this patch. It works... for the 
most part.

I can select a line set of lines and they get staged/unstaged, which is 
good. The part that is not good though is that a lot of common 
operations still don't work as they should:

- I can't click on the icon in the "Unstaged Changes" pane to stage the 
  whole file. Nothing happens when I do that.

- The file that is marked as intent-to-add shows up in both the "Staged 
  Changes" and "Unstaged Changes" panes, with the "Staged Changes" part 
  being empty. Ideally it should only show up in the "Unstaged Changes" 
  pane.

- Selecting the whole file and choosing "Stage Lines for Commit" works 
  well. But choosing "Stage Hunk for Commit" does not. While the changes 
  do get staged, the UI is not properly updated and the file is still 
  listed in the "Unstaged Changes" pane.

  I think the difference here is because for 
  `apply_or_revert_range_or_line`, we call `do_rescan` after it to 
  update the UI, but for `apply_or_revert_hunk` we update the UI 
  "manually" in the function after we are done applying or reverting the 
  changes. So the logic to update the UI needs to be updated to account 
  for this change. Or we can get rid of all that logic and just run a 
  rescan.

And also, like you mentioned, we don't retain the i-t-a state when 
unstaging. But with some quick testing, I see that Git command line 
doesn't either (I tried a plain `git restore --staged`). So IMO we 
should mimic what the command line UI does and not retain the i-t-a 
state when unstaging.

>     
>     Thoughts?

IMO this is a good start but more work needs to be done before we can 
call this feature finished.
 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-699%2Fdscho%2Fgit-gui-stage-ita-hunks-and-lines-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-699/dscho/git-gui-stage-ita-hunks-and-lines-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/699
> 
>  git-gui.sh   |  2 ++
>  lib/diff.tcl | 12 ++++++++----
>  2 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/git-gui.sh b/git-gui.sh
> index 49bd86e635..e08cb17395 100755
> --- a/git-gui.sh
> +++ b/git-gui.sh
> @@ -2080,6 +2080,7 @@ set all_icons(U$ui_index)   file_merge
>  set all_icons(T$ui_index)   file_statechange
>  
>  set all_icons(_$ui_workdir) file_plain
> +set all_icons(A$ui_workdir) file_plain
>  set all_icons(M$ui_workdir) file_mod
>  set all_icons(D$ui_workdir) file_question
>  set all_icons(U$ui_workdir) file_merge
> @@ -2106,6 +2107,7 @@ foreach i {
>  		{A_ {mc "Staged for commit"}}
>  		{AM {mc "Portions staged for commit"}}
>  		{AD {mc "Staged for commit, missing"}}
> +		{AA {mc "Intended to be added"}}
>  
>  		{_D {mc "Missing"}}
>  		{D_ {mc "Staged for removal"}}
> diff --git a/lib/diff.tcl b/lib/diff.tcl
> index 871ad488c2..36d3715f7b 100644
> --- a/lib/diff.tcl
> +++ b/lib/diff.tcl
> @@ -582,7 +582,8 @@ proc apply_or_revert_hunk {x y revert} {
>  	if {$current_diff_side eq $ui_index} {
>  		set failed_msg [mc "Failed to unstage selected hunk."]
>  		lappend apply_cmd --reverse --cached
> -		if {[string index $mi 0] ne {M}} {
> +		set file_state [string index $mi 0]
> +		if {$file_state ne {M} && $file_state ne {A}} {
>  			unlock_index
>  			return
>  		}
> @@ -595,7 +596,8 @@ proc apply_or_revert_hunk {x y revert} {
>  			lappend apply_cmd --cached
>  		}
>  
> -		if {[string index $mi 1] ne {M}} {
> +		set file_state [string index $mi 1]
> +		if {$file_state ne {M} && $file_state ne {A}} {
>  			unlock_index
>  			return
>  		}
> @@ -687,7 +689,8 @@ proc apply_or_revert_range_or_line {x y revert} {
>  		set failed_msg [mc "Failed to unstage selected line."]
>  		set to_context {+}
>  		lappend apply_cmd --reverse --cached
> -		if {[string index $mi 0] ne {M}} {
> +		set file_state [string index $mi 0]
> +		if {$file_state ne {M} && $file_state ne {A}} {
>  			unlock_index
>  			return
>  		}
> @@ -702,7 +705,8 @@ proc apply_or_revert_range_or_line {x y revert} {
>  			lappend apply_cmd --cached
>  		}
>  
> -		if {[string index $mi 1] ne {M}} {
> +		set file_state [string index $mi 1]
> +		if {$file_state ne {M} && $file_state ne {A}} {
>  			unlock_index
>  			return
>  		}
> 

These changes look good to me to set up basic functionality. We just 
need to iron out the rough edges.

> base-commit: 469725c1a3d44f7e1475f1d37cd13e0824d4ea41
> -- 
> gitgitgadget

-- 
Regards,
Pratyush Yadav

^ permalink raw reply

* Re: [GSoC][PATCH] submodule: port submodule subcommand 'add' from shell to C
From: Kaartic Sivaraam @ 2020-08-26 10:54 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: Junio C Hamano, git, christian.couder, johannes.schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller
In-Reply-To: <CAP6+3T2FbjKc35QYiDmaezzKbkrxEOcBqzirm032_tTU2foZ=Q@mail.gmail.com>

On 26-08-2020 14:57, Shourya Shukla wrote:
> On 8/25/20, Kaartic Sivaraam <kaartic.sivaraam@gmail.com> wrote:
>>
>> This part results in a difference in error message in shell and C
>> versions.
>>
>> -- 8< --
>> $ # Shell version
>> $ git submodule add ../subm1 sub
>> A git directory for 'sub' is found locally with remote(s):
>>   origin        /me/subm1
>> If you want to reuse this local git directory instead of cloning again from
>>   /me/subm1
>> use the '--force' option. If the local git directory is not the correct
>> repo
>> or you are unsure what this means choose another name with the '--name'
>> option.
>> $
>> $ # C version
>> $ git submodule add ../subm1 sub
>> A git directory for 'sub' is found locally with remote(s):
>>   origin        /me/subm1
>> error: If you want to reuse this local git directory instead of cloning
>> again from
>>    /me/subm1
>> use the '--force' option. If the local git directory is not the correct
>> repo
>> or you are unsure what this means choose another name with the '--name'
>> option.
>> -- >8 --
>>
>> Note how the third line is oddly prefixed by a `error` unlike the rest
>> of the lines. It would be nice if we could weed out that inconsistency.
>> We could probably use `advise()` for printing the last four lines and
>> `error()` for the lines above them.
> 
> Understood. I will correct this part. BTW, you surely are talking
> about error() on
> the first 2 lines? I think fprintf(stderr, _()) is OK for them otherwise they
> will be prefixed by 'error:' which will not be in line with the shell version.
> 

Yes. It's better to prefix them with `error` because well... it is an
error. I realize the shell version didn't explicitly do this but that
doesn't necessarily mean the error message was helpful without the
prefix. AFAIK, many Git commands prefix their error messages with
`fatal` or `error` which makes it easy to distinguish error messages
from actual program output in the terminal. So, it's good to do the same
here.

-- 
Sivaraam

^ permalink raw reply

* Re: [PATCH 3/3] t7421: eliminate 'grep' check in t7421.4 for mingw compatibility
From: Shourya Shukla @ 2020-08-26 10:40 UTC (permalink / raw)
  To: Kaartic Sivaraam
  Cc: git, gitster, christian.couder, Johannes.Schindelin, peff,
	liu.denton
In-Reply-To: <2a1ea501-4974-4d74-fe3c-d173bbe76855@gmail.com>

On 25/08 08:03, Kaartic Sivaraam wrote:
> On 25-08-2020 17:00, Shourya Shukla wrote:
> > The 'grep' check in test 4 of t7421 resulted in the failure of t7421 on
> > Windows due to a different error message
> > 
> >     error: cannot spawn git: No such file or directory
> > 
> > instead of
> > 
> >     fatal: exec 'rev-parse': cd to 'my-subm' failed: No such file or directory
> > 
> > Tighten up the check to compute '{src,dst}_abbrev' by guarding the
> 
> The change only affects `src_abbrev`. So, it's misleading to mention
> `dst_abbrev` here.

I forgot to change that. Thank you for pointing this out.

> > 'verify_submodule_committish()' call using `p->status !='D'`, so that
> > the former isn't called in case of non-existent submodule directory,
> > consequently, there is no such error message on any execution
> > environment.
> > 
> > Therefore, eliminate the 'grep' check in t7421. Instead, verify the
> > absence of an error message by doing a 'test_must_be_empty' on the
> > file containing the error.
> > 
> > Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> > Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> > Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> > Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> > Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
> > ---
> >  builtin/submodule--helper.c      | 7 ++++---
> >  t/t7421-submodule-summary-add.sh | 2 +-
> >  2 files changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> > index 93d0700891..f1951680f7 100644
> > --- a/builtin/submodule--helper.c
> > +++ b/builtin/submodule--helper.c
> > @@ -1035,7 +1035,7 @@ static void print_submodule_summary(struct summary_cb *info, char *errmsg,
> >  static void generate_submodule_summary(struct summary_cb *info,
> >  				       struct module_cb *p)
> >  {
> > -	char *displaypath, *src_abbrev, *dst_abbrev;
> > +	char *displaypath, *src_abbrev = NULL, *dst_abbrev = NULL;
> 
> Unlike `src_abbrev`, I don't think we need to initilialize `dst_abbrev`
> to NULL here as it would be assigned in all code paths.

Alright. Changed!


^ permalink raw reply

* Re: [PATCH v3 09/11] commit-graph: use generation v2 only if entire chain does
From: Jakub Narębski @ 2020-08-26 10:38 UTC (permalink / raw)
  To: Abhishek Kumar
  Cc: git, Abhishek Kumar via GitGitGadget, Derrick Stolee, Taylor Blau,
	Jakub Narębski
In-Reply-To: <20200826071519.GA6805@Abhishek-Arch>

Hi Abhishek,

Abhishek Kumar <abhishekkumar8222@gmail.com> writes:
> On Sat, Aug 22, 2020 at 07:14:38PM +0200, Jakub Narębski wrote:
>> Hi Abhishek,
>>
>> ...
>>
>> However the commit message do not say anything about the *writing* side.
>>
>
> Revised the commit message to include the following at the end:
>
> When writing the new layer in split commit-graph, we write a GDAT chunk
> only if the topmost layer has a GDAT chunk. This guarantees that if a
> layer has GDAT chunk, all lower layers must have a GDAT chunk as well.
>

All right.

> Rewriting layers follows similar approach: if the topmost layer below
> set of layers being rewritten (in the split commit-graph chain) exists,
> and it does not contain GDAT chunk, then the result of rewrite does not
> have GDAT chunks either.

All right.

I see that you went with proposed more complex (but better) solution...

>>
>> ...
>>
>> To be more detailed, without '--split=replace' we would want the following
>> layer merging behavior:
>>
>>    [layer with GDAT][with GDAT][without GDAT][without GDAT][without GDAT]
>>            1              2           3             4            5
>>
>> In the split commit-graph chain above, merging two topmost layers
>> (layers 4 and 5) should create a layer without GDAT; merging three
>> topmost layers (and any other layers, e.g. two middle ones, i.e. 3 and
>> 4) should create a new layer with GDAT.

A simpler solution would be to create a new merged layer without GDAT if
any of the layers being merged do not have GDAT.

In this solution merging 3+4+5, 3+4, and even 2+3 would result with
layer without GDAT, and only merging 1+2 would result in layer with GDAT.

>>
>>    [layer with GDAT][with GDAT][without GDAT][-------without GDAT-------]
>>            1              2           3               merged
>>
>>    [layer with GDAT][with GDAT][-------------with GDAT------------------]
>>            1              2                    merged
>>
>> I hope those ASCII-art pictures help understanding it
>>
>
> Thanks! There were helpful.
>
> While we work as expected in the first scenario i.e merging 4 and 5, we
> would *still* write a layer without GDAT in the second scenario.
>
> I have tweaked split_graph_merge_strategy() to fix this:
>
> ----------------------------------------------
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 6d54d9a286..246fad030d 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1973,6 +1973,9 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
>  		}
>  	}
>
> +	if (!ctx->write_generation_data && g->chunk_generation_data)
> +		ctx->write_generation_data = 1;
> +
>  	if (flags != COMMIT_GRAPH_SPLIT_REPLACE)
>  		ctx->new_base_graph = g;
>  	else if (ctx->num_commit_graphs_after != 1)

...which turned out to be not that complicated.  Nice work!

Though this needs tests that if fulfills the stated condition (because I
am not sure if it is entirely correct: we are not checking the layer
below current one, isn't it?... ah, you explain it below).

One possible solution would be to grep `test-tool read-graph` output for
"^chunks: ", then pass it through `uniq` (without `sort`!), check that
the number of lines is less or equal 2, and if there are two lines then
check that we get the following contents:

  chunks: oid_fanout oid_lookup commit_metadata generation_data
  chunks: oid_fanout oid_lookup commit_metadata

(assuming that information about layers is added in top-down order).

This test must be run with GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0, which
I think is the default.

> ----------------------------------------------------
>
> That is, if we were not writing generation data (because of mixed
> generation concerns) but the new topmost layer has a generation data
> chunk, we have merged all layers without GDAT chunk and can now write a
> GDAT chunk safely.

All right.

[...]
>>> diff --git a/commit-graph.h b/commit-graph.h
>>> index f78c892fc0..3cf89d895d 100644
>>> --- a/commit-graph.h
>>> +++ b/commit-graph.h
>>> @@ -63,6 +63,7 @@ struct commit_graph {
>>>  	struct object_directory *odb;
>>>
>>>  	uint32_t num_commits_in_base;
>>> +	uint32_t read_generation_data;
>>>  	struct commit_graph *base_graph;
>>>
>>
>> First, why `read_generation_data` is of uint32_t type, when it stores
>> (as far as I understand it), a "boolean" value of either 0 or 1?
>
> Yes, using unsigned int instead of uint32_t (although in most of cases
> it would be same).  If commit_graph had other flags as well, we could
> have used a bit field.

OK.

>> Second, couldn't we simply set chunk_generation_data to NULL?  Or would
>> that interfere with the case of rewriting, where we want to use existing
>> GDAT data when writing new commit-graph with GDAT chunk?
>
> It interferes with rewriting the split commit-graph, as you might have
> guessed from the above code snippet.

All right.

[...]
>>> @@ -885,6 +908,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
>>>  	uint32_t pos;
>>>  	if (!prepare_commit_graph(r))
>>>  		return;
>>> +
>>>  	if (find_commit_in_graph(item, r->objects->commit_graph, &pos))
>>>  		fill_commit_graph_info(item, r->objects->commit_graph, pos);
>>>  }
>>
>> This is unrelated whitespace fix, a "while at it" in neighbourhood of
>> changes.  All right then.
>>
>
> Reverted this change, as it's unimportant.

Actually I am not against fixing the whitespace in the neighbourhood of
changes, so you can keep it or revert it (discard).

>>> @@ -2192,6 +2216,9 @@ int write_commit_graph(struct object_directory *odb,
>>
>> ...
>>
>> It would be nice to have an example with merging layers (whether we
>> would handle it in strict or relaxed way).
>>
>
> Sure, will add.

Thanks.


Best,
--
Jakub Narębski

^ permalink raw reply

* Re: [PATCH 2/3] submodule: fix style in function definition
From: Shourya Shukla @ 2020-08-26  9:45 UTC (permalink / raw)
  To: gitster
  Cc: Johannes.Schindelin, chriscool, christian.couder, git,
	kaartic.sivaraam, liu.denton, peff, shouryashukla.oo
In-Reply-To: <xmqqwo1mzc6y.fsf@gitster.c.googlers.com>

>> The definitions of 'verify_submodule_committish()' and
>> 'print_submodule_summary()' had wrong styling in terms of the asterisk
>> placement. Amend them.

> I pointed out only these two, but that does not necessarily mean
> they are the only ones.  Have you checked all the new code added by
> the series?

There is one more. It is not related to my patch series though. Here it
is:
----
static char *compute_rev_name(const char *sub_path, const char* object_id)
----
Would you like me to correct this one too?

>> Also, the warning printed in case of an unexpected file mode printed the
>> mode in decimal. Print it in octal for enhanced readability.

>I actually did check this side ;-) and am reasonably sure that there
>aren't any other irrational choice of format specifiers.

Sure! No worries!


^ permalink raw reply

* Re: [GSoC][PATCH] submodule: port submodule subcommand 'add' from shell to C
From: Shourya Shukla @ 2020-08-26  9:27 UTC (permalink / raw)
  To: Kaartic Sivaraam
  Cc: Junio C Hamano, git, christian.couder, johannes.schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller
In-Reply-To: <43337924c09119d43c74fdad3f00d4dab76edb51.camel@gmail.com>

On 8/25/20, Kaartic Sivaraam <kaartic.sivaraam@gmail.com> wrote:
> On Mon, 2020-08-24 at 11:35 -0700, Junio C Hamano wrote:
>> Now it is a different question if the original is correct to begin
>> with ;-).
>>
>
> By looking at commit message of 619acfc78c (submodule add: extend force
> flag to add existing repos, 2016-10-06), I'm assuming it's correct.
> There are chances I might be missing something, though.
>
> Speaking of correctness, I'm surprised how the port passed the
> following test t7400.63 despite the incorrect check.
>
> -- 8< --
> $ ./t7400-submodule-basic.sh
> ... snip ...
> ok 62 - add submodules without specifying an explicit path
> ok 63 - add should fail when path is used by a file
> ok 64 - add should fail when path is used by an existing directory
> ... snip ...
> -- >8 --
>
> Most likely it passed because it slipped through the incorrect check
> and failed later in the code[1]. That's not good, of course.
>
>> > 	}
>> >
>> > Is this part correct? I am not very sure about this. This particular
>> > part is not covered in any test or test script, so, I do not have a
>> > solid method of knowing the correctness of this segment.
>> > Feedback and reviews are appreciated.
>
>
>> +static int add_submodule(struct add_data *info)
>> +{
>> +	/* perhaps the path exists and is already a git repo, else clone it */
>> +	if (is_directory(info->sm_path)) {
>> +		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
>> +		if (is_directory(sub_git_path) || file_exists(sub_git_path))
>> +			printf(_("Adding existing repo at '%s' to the index\n"),
>> +				 info->sm_path);
>> +		else
>> +			die(_("'%s' already exists and is not a valid git repo"),
>> +			      info->sm_path);
>> +		free(sub_git_path);
>> +	} else {
>> +		struct strvec clone_args = STRVEC_INIT;
>> +		struct child_process cp = CHILD_PROCESS_INIT;
>> +		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
>> +
>> +		if (is_directory(submodule_git_dir)) {
>> +			if (!info->force) {
>> +				struct child_process cp_rem = CHILD_PROCESS_INIT;
>> +				struct strbuf sb_rem = STRBUF_INIT;
>> +				cp_rem.git_cmd = 1;
>> +				fprintf(stderr, _("A git directory for '%s' is "
>> +					"found locally with remote(s):\n"),
>> +					info->sm_name);
>> +				strvec_pushf(&cp_rem.env_array,
>> +					     "GIT_DIR=%s", submodule_git_dir);
>> +				strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
>> +				strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
>> +				if (!capture_command(&cp_rem, &sb_rem, 0)) {
>> +					modify_remote_v(&sb_rem);
>> +				}
>> +				error(_("If you want to reuse this local git "
>> +				      "directory instead of cloning again from\n "
>> +				      "  %s\n"
>> +				      "use the '--force' option. If the local "
>> +				      "git directory is not the correct repo\n"
>> +				      "or you are unsure what this means choose "
>> +				      "another name with the '--name' option."),
>> +				      info->realrepo);
>> +				return 1;
>> +			} else {
>> +				printf(_("Reactivating local git directory for "
>> +					 "submodule '%s'."), info->sm_path);
>> +			}
>> +		}
>> +		free(submodule_git_dir);
>
> This part results in a difference in error message in shell and C
> versions.
>
> -- 8< --
> $ # Shell version
> $ git submodule add ../subm1 sub
> A git directory for 'sub' is found locally with remote(s):
>   origin        /me/subm1
> If you want to reuse this local git directory instead of cloning again from
>   /me/subm1
> use the '--force' option. If the local git directory is not the correct
> repo
> or you are unsure what this means choose another name with the '--name'
> option.
> $
> $ # C version
> $ git submodule add ../subm1 sub
> A git directory for 'sub' is found locally with remote(s):
>   origin        /me/subm1
> error: If you want to reuse this local git directory instead of cloning
> again from
>    /me/subm1
> use the '--force' option. If the local git directory is not the correct
> repo
> or you are unsure what this means choose another name with the '--name'
> option.
> -- >8 --
>
> Note how the third line is oddly prefixed by a `error` unlike the rest
> of the lines. It would be nice if we could weed out that inconsistency.
> We could probably use `advise()` for printing the last four lines and
> `error()` for the lines above them.

Understood. I will correct this part. BTW, you surely are talking
about error() on
the first 2 lines? I think fprintf(stderr, _()) is OK for them otherwise they
will be prefixed by 'error:' which will not be in line with the shell version.

^ permalink raw reply

* Re: [GSoC][PATCH] submodule: port submodule subcommand 'add' from shell to C
From: Shourya Shukla @ 2020-08-26  9:15 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, johannes.schindelin, liu.denton,
	kaartic.sivaraam
In-Reply-To: <xmqq8se36gev.fsf@gitster.c.googlers.com>

On 24/08 11:35, Junio C Hamano wrote:
> Shourya Shukla <shouryashukla.oo@gmail.com> writes:
> 
> > 	if test -z "$force"
> > 	then
> > 		git ls-files --error-unmatch "$sm_path" > /dev/null 2>&1 &&
> > 		die "$(eval_gettext "'\$sm_path' already exists in the index")"
> > 	else
> > 		git ls-files -s "$sm_path" | sane_grep -v "^160000" > /dev/null 2>&1 &&
> > 		die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
> > 	fi
> 
> Hmph.  So,
> 
>  - if we are not being 'force'd, we see if there is anything in the
>    index for the path and error out, whether it is a gitlink or not.
> 
>  - if there is 'force' option, we see what the given path is in the
>    index, and if it is already a gitlink, then die.  That sort of
>    makes sense, as long as the remainder of the code deals with the
>    path that is not a submodule in a sensible way.
> 
> > This is what I have done in C:
> >
> > 	if (!force) {
> > 		if (is_directory(path) && submodule_from_path(the_repository, &null_oid, path))
> > 			die(_("'%s' already exists in the index"), path);
> 
> The shell version would error out with anything in the index, so I'd
> expect that a faithful conversion would not call is_directory() nor
> submodule_from_path() at all---it would just look path up in the_index
> and complains if anything is found.  For example, the quoted part in
> the original above is what gives the error message when I do
> 
> 	$ git submodule add ./Makefile
> 	'Makefile' already exists in the index.
> 
> I think.  And the above code won't trigger the "already exists" at
> all because 'path' is not a directory.

Alright. That is correct. I tried to use a multitude of functions but
did not find luck with any of them. The functions I tried:

    - index_path() to check if the path is in the index. For some
      reason, it switched to the 'default' case and return the
      'unsupported file type' error.

    - A combination of doing an OR with index_file_exists() and
      index_dir_exists(). Still no luck. t7406.43 fails.

    - Using index_name_pos() along with the above two functions. Again a
      failure in the same test.

I feel that index_name_pos() should suffice this task but it fails in
t7406.43. The SM is in index since 'git ls-files --error-unmatch s1'
does return 's1' (s1 is the submodule). What am I missing here?

> > 	} else {
> > 		int err;
> > 		if (index_name_pos(&the_index, path, strlen(path)) >= 0 &&
> > 		    !is_submodule_populated_gently(path, &err))
> > 			die(_("'%s' already exists in the index and is not a "
> > 			      "submodule"), path);
>
> Likewise.  The above does much more than the original.
>
> The original was checking if the found cache entry has 160000 mode
> bit, so the second test would not be is_submodule_populated_gently()
> but more like !S_ISGITLINK(ce->ce_mode)

Using this results in failure of t7506.[33-40]. I implemented this in
two ways:

    1. Use stat() to initialise the stat st corresponding to the 'path'.
       Then do a '!S_ISGITLINK(st.st_mode)'.

    2. Run a for loop:
		for (i = 0; i < active_nr; i++) {
		const struct cache_entry *ce = active_cache[i];

		if (index_name_pos(&the_index, path, strlen(path)) >= 0 &&
		    !S_ISGITLINK(ce->ce_mode))
			die(_("'%s' already exists in the index and is not a "
			      "submodule"), path);
        }

        Still the tests failed. What is meant by 'active_nr' BTW? I am
        not aware of this term.

Where am I going wrong for both the if-cases?


^ permalink raw reply

* Re: [PATCH v3 09/11] commit-graph: use generation v2 only if entire chain does
From: Abhishek Kumar @ 2020-08-26  7:15 UTC (permalink / raw)
  To: Jakub Narębski; +Cc: abhishekkumar8222, git, gitgitgadget, me, stolee
In-Reply-To: <85pn7ihabl.fsf@gmail.com>

On Sat, Aug 22, 2020 at 07:14:38PM +0200, Jakub Narębski wrote:
> Hi Abhishek,
> 
> ... 
> 
> However the commit message do not say anything about the *writing* side.
> 

Revised the commit message to include the following at the end:

When writing the new layer in split commit-graph, we write a GDAT chunk
only if the topmost layer has a GDAT chunk. This guarantees that if a
layer has GDAT chunk, all lower layers must have a GDAT chunk as well.

Rewriting layers follows similar approach: if the topmost layer below
set of layers being rewritten (in the split commit-graph chain) exists,
and it does not contain GDAT chunk, then the result of rewrite does not
have GDAT chunks either.

> 
> ...
> 
> To be more detailed, without '--split=replace' we would want the following
> layer merging behavior:
> 
>    [layer with GDAT][with GDAT][without GDAT][without GDAT][without GDAT]
>            1              2           3             4            5
> 
> In the split commit-graph chain above, merging two topmost layers
> (layers 4 and 5) should create a layer without GDAT; merging three
> topmost layers (and any other layers, e.g. two middle ones, i.e. 3 and
> 4) should create a new layer with GDAT.
> 
>    [layer with GDAT][with GDAT][without GDAT][-------without GDAT-------]
>            1              2           3               merged
> 
>    [layer with GDAT][with GDAT][-------------with GDAT------------------]
>            1              2                    merged
> 
> I hope those ASCII-art pictures help understanding it
> 

Thanks! There were helpful.

While we work as expected in the first scenario i.e merging 4 and 5, we
would *still* write a layer without GDAT in the second scenario.

I have tweaked split_graph_merge_strategy() to fix this:

----------------------------------------------

diff --git a/commit-graph.c b/commit-graph.c
index 6d54d9a286..246fad030d 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1973,6 +1973,9 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx)
 		}
 	}
 
+	if (!ctx->write_generation_data && g->chunk_generation_data)
+		ctx->write_generation_data = 1;
+
 	if (flags != COMMIT_GRAPH_SPLIT_REPLACE)
 		ctx->new_base_graph = g;
 	else if (ctx->num_commit_graphs_after != 1)

----------------------------------------------------

That is, if we were not writing generation data (because of mixed
generation concerns) but the new topmost layer has a generation data
chunk, we have merged all layers without GDAT chunk and can now write a
GDAT chunk safely.

> >
> > It is difficult to expose this issue in a test. Since we _start_ with
> > artificially low generation numbers, any commit walk that prioritizes
> > generation numbers will walk all of the commits with high generation
> > number before walking the commits with low generation number. In all the
> > cases I tried, the commit-graph layers themselves "protect" any
> > incorrect behavior since none of the commits in the lower layer can
> > reach the commits in the upper layer.
> >
> > This issue would manifest itself as a performance problem in this case,
> > especially with something like "git log --graph" since the low
> > generation numbers would cause the in-degree queue to walk all of the
> > commits in the lower layer before allowing the topo-order queue to write
> > anything to output (depending on the size of the upper layer).
> 
> Wouldn't breaking the reachability condition promise make some Git
> commands to return *incorrect* results if they short-circuit, stop
> walking if generation number shows that A cannot reach B?
> 
> I am talking here about commands that return boolean, or select subset
> from given set of revisions:
> - git merge-base --is-ancestor <B> <A>
> - git branch branch-A <A> && git branch --contains <B>
> - git branch branch-B <B> && git branch --merged <A>
> 
> Git assumes that generation numbers fulfill the following condition:
> 
>   if A can reach B, then gen(A) > gen(B)
> 
> Notably this includes commits not in commit-graph, and clamped values.
> 
> However, in the following case
> 
> * if commit A is from higher layer without GDAT
>   and uses topological levels for 'generation', e.g. 115 (in a small repo)
> * and commit B is from lower layer with GDAT
>   and uses corrected commit date as 'generation', for example 1598112896,
> 
> it may happen that A (later commit) can reach B (earlier commit), but
> gen(B) > gen(A).  The reachability condition promise for generation
> numbers is broken.
> 
> >
> > Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> > ---
> 
> I have reordered files in the patch itself to make it easier to review
> the proposed changes.
> 
> >  commit-graph.h                |  1 +
> >  commit-graph.c                | 32 +++++++++++++++-
> >  t/t5324-split-commit-graph.sh | 70 +++++++++++++++++++++++++++++++++++
> >  3 files changed, 102 insertions(+), 1 deletion(-)
> >
> > diff --git a/commit-graph.h b/commit-graph.h
> > index f78c892fc0..3cf89d895d 100644
> > --- a/commit-graph.h
> > +++ b/commit-graph.h
> > @@ -63,6 +63,7 @@ struct commit_graph {
> >  	struct object_directory *odb;
> >
> >  	uint32_t num_commits_in_base;
> > +	uint32_t read_generation_data;
> >  	struct commit_graph *base_graph;
> >
> 
> First, why `read_generation_data` is of uint32_t type, when it stores
> (as far as I understand it), a "boolean" value of either 0 or 1?
> 

Yes, using unsigned int instead of uint32_t (although in most of cases
it would be same).  If commit_graph had other flags as well, we could
have used a bit field.

> Second, couldn't we simply set chunk_generation_data to NULL?  Or would
> that interfere with the case of rewriting, where we want to use existing
> GDAT data when writing new commit-graph with GDAT chunk?

It interferes with rewriting the split commit-graph, as you might have
guessed from the above code snippet.

> 
> ...
>
> > diff --git a/commit-graph.c b/commit-graph.c
> 
> >  		graph_data->generation = item->date +
> >  			(timestamp_t) get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index);
> >  	else
> > @@ -885,6 +908,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item)
> >  	uint32_t pos;
> >  	if (!prepare_commit_graph(r))
> >  		return;
> > +
> >  	if (find_commit_in_graph(item, r->objects->commit_graph, &pos))
> >  		fill_commit_graph_info(item, r->objects->commit_graph, pos);
> >  }
> 
> This is unrelated whitespace fix, a "while at it" in neighbourhood of
> changes.  All right then.
> 

Reverted this change, as it's unimportant.

> > @@ -2192,6 +2216,9 @@ int write_commit_graph(struct object_directory *odb,
>
> ...
> 
> It would be nice to have an example with merging layers (whether we
> would handle it in strict or relaxed way).
> 

Sure, will add.

> > +
> >  test_done
> 
> Best,
> -- 
> Jakub Narębski

Thanks
- Abhishek

^ permalink raw reply related

* DONATION
From: Richard Wahl @ 2020-08-26  6:43 UTC (permalink / raw)
  To: git

Hello git,

you have a donation of $3,000,000.00 ( 3 million dollars).
My name is Richard Wahl from the united states. I won the America 
lottery worth $533 million and I am donating a portion of it to 
just 10 lucky people and a few Orphanage homes as a memorandum of 
goodwill to humanity. and also as a way of assistance over the 
COVID 19 Pandemic.
If you are a recipient of this mail git@vger.kernel.org  contact 
me on  adelenebreton@gmail.com for more details and claim. I may 
be very busy but I will take out time to respond to you.

^ permalink raw reply

* Re: [PATCH v2 2/2] ref-filter: 'contents:trailers' show error if `:` is missing
From: Christian Couder @ 2020-08-26  6:22 UTC (permalink / raw)
  To: Hariom verma
  Cc: Eric Sunshine, Junio C Hamano, Hariom Verma via GitGitGadget,
	Git List, Heba Waly
In-Reply-To: <CAP8UFD03Am94_84FvRPxEdt_AG74864eQ4TimggKtUYWjJYqCg@mail.gmail.com>

On Wed, Aug 26, 2020 at 8:18 AM Christian Couder
<christian.couder@gmail.com> wrote:

> I think we could also get rid of the "match_and_" part of the
> suggestion, in the same way as skip_prefix() is not called
> match_and_skip_prefix(). Readers can just expect that if there is no
> match the function will return 0.
>
> So maybe "extract_field_option()".

If we want to hint more that it works in the way as skip_prefix(), we
could call it "skip_field()".

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox