* Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one")
@ 2021-12-06 12:16 Uwe Kleine-König
2021-12-07 2:19 ` brian m. carlson
2021-12-07 5:29 ` Elijah Newren
0 siblings, 2 replies; 4+ messages in thread
From: Uwe Kleine-König @ 2021-12-06 12:16 UTC (permalink / raw)
To: git; +Cc: Elijah Newren, Björn Lässig
[-- Attachment #1: Type: text/plain, Size: 2492 bytes --]
Hello,
I admit this is somewhat of a corner case, still it happens in the
reality of our admin team ...
Initially this was noticed after upgrading the OS from Debian buster
(with git 2.20.1) to Debian bullseye (with git 2.30.2).
(wgit is just a wrapper for git to call it from my ~/src/git.)
This is the good ("old") case:
uwe@taurus:~/tmp/8d92fb29270$ wgit version
git version 2.25.2.7.g0bbd0e8b5233
uwe@taurus:~/tmp/8d92fb292706$ wgit init
Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
uwe@taurus:~/tmp/8d92fb292706$ wgit status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: subdir/somefile
with 8d92fb292706, the following happens:
uwe@taurus:~/tmp/8d92fb292706$ wgit version
git version 2.25.2.8.g8d92fb292706
uwe@taurus:~/tmp/8d92fb292706$ wgit init
Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
uwe@taurus:~/tmp/8d92fb292706$ wgit status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
subdir/
nothing added to commit but untracked files present (use "git add" to track)
So git after 8d92fb292706 doesn't add files from a subdirectory if
said subdirectory is tracked in git, too.
While I'm not sure which of the two behaviours is the bogus one, this is
a change in behaviour that I guess wasn't intended in 8d92fb292706.
Is this something that needs fixing?
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | https://www.pengutronix.de/ |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one")
2021-12-06 12:16 Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") Uwe Kleine-König
@ 2021-12-07 2:19 ` brian m. carlson
2021-12-07 5:29 ` Elijah Newren
1 sibling, 0 replies; 4+ messages in thread
From: brian m. carlson @ 2021-12-07 2:19 UTC (permalink / raw)
To: Uwe Kleine-König; +Cc: git, Elijah Newren, Björn Lässig
[-- Attachment #1: Type: text/plain, Size: 765 bytes --]
On 2021-12-06 at 12:16:39, Uwe Kleine-König wrote:
> Is this something that needs fixing?
I don't think so. The old behavior was to add a file that's under a
different repository (subdir), which is bizarre and at least to me,
unexpected. I believe the new behavior causes us to not add the file
because it belongs to a different repo, which seems correct.
I have no clue how we could practically check out the file in the first
situation if subdir remains a git repository. Even if it does work, it
is likely to lead to confusing behavior when the index for subdir is not
in sync with the index for the main repo.
If I'm missing something here, though, please do point it out.
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one")
2021-12-06 12:16 Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") Uwe Kleine-König
2021-12-07 2:19 ` brian m. carlson
@ 2021-12-07 5:29 ` Elijah Newren
2021-12-07 7:14 ` Uwe Kleine-König
1 sibling, 1 reply; 4+ messages in thread
From: Elijah Newren @ 2021-12-07 5:29 UTC (permalink / raw)
To: Uwe Kleine-König; +Cc: Git Mailing List, Björn Lässig
On Mon, Dec 6, 2021 at 4:16 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello,
>
> I admit this is somewhat of a corner case, still it happens in the
> reality of our admin team ...
> Initially this was noticed after upgrading the OS from Debian buster
> (with git 2.20.1) to Debian bullseye (with git 2.30.2).
>
> (wgit is just a wrapper for git to call it from my ~/src/git.)
>
> This is the good ("old") case:
>
> uwe@taurus:~/tmp/8d92fb29270$ wgit version
> git version 2.25.2.7.g0bbd0e8b5233
>
> uwe@taurus:~/tmp/8d92fb292706$ wgit init
> Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
>
> uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
> uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
> uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
> Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
> uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
>
> uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
> uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
> uwe@taurus:~/tmp/8d92fb292706$ wgit status
> On branch master
>
> No commits yet
>
> Changes to be committed:
> (use "git rm --cached <file>..." to unstage)
> new file: subdir/somefile
Eek, that's bad. I think there's a number of dragons going down that route.
> with 8d92fb292706, the following happens:
>
> uwe@taurus:~/tmp/8d92fb292706$ wgit version
> git version 2.25.2.8.g8d92fb292706
> uwe@taurus:~/tmp/8d92fb292706$ wgit init
> Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
> uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
> uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
> uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
> Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
> uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
> uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
> uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
Not optimal; more on this below.
> uwe@taurus:~/tmp/8d92fb292706$ wgit status
> On branch master
>
> No commits yet
>
> Untracked files:
> (use "git add <file>..." to include in what will be committed)
> subdir/
>
> nothing added to commit but untracked files present (use "git add" to track)
This part looks good to me.
> So git after 8d92fb292706 doesn't add files from a subdirectory if
> said subdirectory is tracked in git, too.
>
> While I'm not sure which of the two behaviours is the bogus one, this is
> a change in behaviour that I guess wasn't intended in 8d92fb292706.
I put some effort separate from that commit into avoiding accidentally
recursing into nested git dirs; see e.g. commit 09487f2cba ("clean:
avoid removing untracked files in a nested git repository",
2019-09-17). So, I was slightly surprised that some other commit
hadn't fixed this.
However, it's not all that surprising to me that 8d92fb292706 affected
this. Prior to that commit, we visited untracked paths which were n
directories deep a ridiculous 2^n times. But what made it even more
fun was that the status returned for any given path (tracked, ignored,
not interesting to the traversal, etc.) was not always the same; later
traversals might return something different than earlier traversals.
That confusion made it real "fun" trying to ensure no regressions when
reducing the number of visits to any given path from 2^n down to 1.
The fact that side effects of the traversals (the population of the
dir.entries and dir.ignored) could have essentially functioned to
override a later traversal's return status certainly didn't help; it
was such a mess.
But, interestingly, the fixed behavior here also depends pretty
strongly on commit b9670c1f5e ("dir: fix checks on common prefix
directory", 2019-12-19) which came months earlier. This is
particularly important in combination with the following comment from
dir.h:
/**
* If set, recurse into a directory that looks like a Git directory.
* Otherwise it is shown as a directory.
*/
DIR_NO_GITLINKS = 1<<3,
which suggests that cmd_add() (which didn't set this flag) should have
never been recursing into a Git-tracked directory. In other words,
this was a bug all along.
> Is this something that needs fixing?
I agree with brian elsewhere in this thread that not adding the file
is correct. However, two points:
* I would prefer to see a warning/error from git add when it doesn't
add a path (Any takers? #leftoverbits maybe?)
* It is possible that one might want to be able to force the addition
of files to an outer repository despite existing within a directory
tracked by an inner git repository, perhaps with a double `--force`
being passed to git-add (much like git-clean allows). If so, that
could be implemented via the addition of
dir.flags |= DIR_NO_GITLINKS;
to cmd_add() when the double force is detected.
Hope that helps,
Elijah
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one")
2021-12-07 5:29 ` Elijah Newren
@ 2021-12-07 7:14 ` Uwe Kleine-König
0 siblings, 0 replies; 4+ messages in thread
From: Uwe Kleine-König @ 2021-12-07 7:14 UTC (permalink / raw)
To: Elijah Newren; +Cc: Git Mailing List, Björn Lässig
[-- Attachment #1: Type: text/plain, Size: 6222 bytes --]
Hello,
first of all thanks for addressing my report, also to brian.
On Mon, Dec 06, 2021 at 09:29:59PM -0800, Elijah Newren wrote:
> On Mon, Dec 6, 2021 at 4:16 AM Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > I admit this is somewhat of a corner case, still it happens in the
> > reality of our admin team ...
> > Initially this was noticed after upgrading the OS from Debian buster
> > (with git 2.20.1) to Debian bullseye (with git 2.30.2).
> >
> > (wgit is just a wrapper for git to call it from my ~/src/git.)
> >
> > This is the good ("old") case:
> >
> > uwe@taurus:~/tmp/8d92fb29270$ wgit version
> > git version 2.25.2.7.g0bbd0e8b5233
> >
> > uwe@taurus:~/tmp/8d92fb292706$ wgit init
> > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
> >
> > uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
> > uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
> > uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
> > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
> > uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
> >
> > uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
> > uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
> > uwe@taurus:~/tmp/8d92fb292706$ wgit status
> > On branch master
> >
> > No commits yet
> >
> > Changes to be committed:
> > (use "git rm --cached <file>..." to unstage)
> > new file: subdir/somefile
>
> Eek, that's bad. I think there's a number of dragons going down that route.
Yes, as soon as you start to checkout files in one of the involved
repositories you implicitly change the other one. In this case git is
only used to track files in /etc and elsewhere, and there are two
mechanisms to track them. (Don't ask for the reasons, I don't know them.
:-) In this setup (I think) the dragons should be well fixed to their
chains.
> > with 8d92fb292706, the following happens:
> >
> > uwe@taurus:~/tmp/8d92fb292706$ wgit version
> > git version 2.25.2.8.g8d92fb292706
> > uwe@taurus:~/tmp/8d92fb292706$ wgit init
> > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/
> > uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir
> > uwe@taurus:~/tmp/8d92fb292706$ cd subdir/
> > uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init
> > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/
> > uwe@taurus:~/tmp/8d92fb292706/subdir$ cd ..
> > uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile
> > uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile
>
> Not optimal; more on this below.
>
> > uwe@taurus:~/tmp/8d92fb292706$ wgit status
> > On branch master
> >
> > No commits yet
> >
> > Untracked files:
> > (use "git add <file>..." to include in what will be committed)
> > subdir/
> >
> > nothing added to commit but untracked files present (use "git add" to track)
>
> This part looks good to me.
>
> > So git after 8d92fb292706 doesn't add files from a subdirectory if
> > said subdirectory is tracked in git, too.
> >
> > While I'm not sure which of the two behaviours is the bogus one, this is
> > a change in behaviour that I guess wasn't intended in 8d92fb292706.
>
> I put some effort separate from that commit into avoiding accidentally
> recursing into nested git dirs; see e.g. commit 09487f2cba ("clean:
> avoid removing untracked files in a nested git repository",
> 2019-09-17). So, I was slightly surprised that some other commit
> hadn't fixed this.
>
> However, it's not all that surprising to me that 8d92fb292706 affected
> this. Prior to that commit, we visited untracked paths which were n
> directories deep a ridiculous 2^n times. But what made it even more
> fun was that the status returned for any given path (tracked, ignored,
> not interesting to the traversal, etc.) was not always the same; later
> traversals might return something different than earlier traversals.
> That confusion made it real "fun" trying to ensure no regressions when
> reducing the number of visits to any given path from 2^n down to 1.
> The fact that side effects of the traversals (the population of the
> dir.entries and dir.ignored) could have essentially functioned to
> override a later traversal's return status certainly didn't help; it
> was such a mess.
>
> But, interestingly, the fixed behavior here also depends pretty
> strongly on commit b9670c1f5e ("dir: fix checks on common prefix
> directory", 2019-12-19) which came months earlier. This is
> particularly important in combination with the following comment from
> dir.h:
> /**
> * If set, recurse into a directory that looks like a Git directory.
> * Otherwise it is shown as a directory.
> */
> DIR_NO_GITLINKS = 1<<3,
> which suggests that cmd_add() (which didn't set this flag) should have
> never been recursing into a Git-tracked directory. In other words,
> this was a bug all along.
>
> > Is this something that needs fixing?
>
> I agree with brian elsewhere in this thread that not adding the file
> is correct. However, two points:
>
> * I would prefer to see a warning/error from git add when it doesn't
> add a path (Any takers? #leftoverbits maybe?)
Agreed, also maybe return an exit code != 0?
> * It is possible that one might want to be able to force the addition
> of files to an outer repository despite existing within a directory
> tracked by an inner git repository, perhaps with a double `--force`
> being passed to git-add (much like git-clean allows). If so, that
> could be implemented via the addition of
> dir.flags |= DIR_NO_GITLINKS;
> to cmd_add() when the double force is detected.
I think this would be good.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | https://www.pengutronix.de/ |
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-12-07 7:14 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-12-06 12:16 Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") Uwe Kleine-König
2021-12-07 2:19 ` brian m. carlson
2021-12-07 5:29 ` Elijah Newren
2021-12-07 7:14 ` Uwe Kleine-König
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).