* Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") @ 2021-12-06 12:16 Uwe Kleine-König 2021-12-07 2:19 ` brian m. carlson 2021-12-07 5:29 ` Elijah Newren 0 siblings, 2 replies; 4+ messages in thread From: Uwe Kleine-König @ 2021-12-06 12:16 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Björn Lässig [-- Attachment #1: Type: text/plain, Size: 2492 bytes --] Hello, I admit this is somewhat of a corner case, still it happens in the reality of our admin team ... Initially this was noticed after upgrading the OS from Debian buster (with git 2.20.1) to Debian bullseye (with git 2.30.2). (wgit is just a wrapper for git to call it from my ~/src/git.) This is the good ("old") case: uwe@taurus:~/tmp/8d92fb29270$ wgit version git version 2.25.2.7.g0bbd0e8b5233 uwe@taurus:~/tmp/8d92fb292706$ wgit init Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/ uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir uwe@taurus:~/tmp/8d92fb292706$ cd subdir/ uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/ uwe@taurus:~/tmp/8d92fb292706/subdir$ cd .. uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile uwe@taurus:~/tmp/8d92fb292706$ wgit status On branch master No commits yet Changes to be committed: (use "git rm --cached <file>..." to unstage) new file: subdir/somefile with 8d92fb292706, the following happens: uwe@taurus:~/tmp/8d92fb292706$ wgit version git version 2.25.2.8.g8d92fb292706 uwe@taurus:~/tmp/8d92fb292706$ wgit init Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/ uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir uwe@taurus:~/tmp/8d92fb292706$ cd subdir/ uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/ uwe@taurus:~/tmp/8d92fb292706/subdir$ cd .. uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile uwe@taurus:~/tmp/8d92fb292706$ wgit status On branch master No commits yet Untracked files: (use "git add <file>..." to include in what will be committed) subdir/ nothing added to commit but untracked files present (use "git add" to track) So git after 8d92fb292706 doesn't add files from a subdirectory if said subdirectory is tracked in git, too. While I'm not sure which of the two behaviours is the bogus one, this is a change in behaviour that I guess wasn't intended in 8d92fb292706. Is this something that needs fixing? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | https://www.pengutronix.de/ | [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") 2021-12-06 12:16 Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") Uwe Kleine-König @ 2021-12-07 2:19 ` brian m. carlson 2021-12-07 5:29 ` Elijah Newren 1 sibling, 0 replies; 4+ messages in thread From: brian m. carlson @ 2021-12-07 2:19 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: git, Elijah Newren, Björn Lässig [-- Attachment #1: Type: text/plain, Size: 765 bytes --] On 2021-12-06 at 12:16:39, Uwe Kleine-König wrote: > Is this something that needs fixing? I don't think so. The old behavior was to add a file that's under a different repository (subdir), which is bizarre and at least to me, unexpected. I believe the new behavior causes us to not add the file because it belongs to a different repo, which seems correct. I have no clue how we could practically check out the file in the first situation if subdir remains a git repository. Even if it does work, it is likely to lead to confusing behavior when the index for subdir is not in sync with the index for the main repo. If I'm missing something here, though, please do point it out. -- brian m. carlson (he/him or they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") 2021-12-06 12:16 Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") Uwe Kleine-König 2021-12-07 2:19 ` brian m. carlson @ 2021-12-07 5:29 ` Elijah Newren 2021-12-07 7:14 ` Uwe Kleine-König 1 sibling, 1 reply; 4+ messages in thread From: Elijah Newren @ 2021-12-07 5:29 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Git Mailing List, Björn Lässig On Mon, Dec 6, 2021 at 4:16 AM Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote: > > Hello, > > I admit this is somewhat of a corner case, still it happens in the > reality of our admin team ... > Initially this was noticed after upgrading the OS from Debian buster > (with git 2.20.1) to Debian bullseye (with git 2.30.2). > > (wgit is just a wrapper for git to call it from my ~/src/git.) > > This is the good ("old") case: > > uwe@taurus:~/tmp/8d92fb29270$ wgit version > git version 2.25.2.7.g0bbd0e8b5233 > > uwe@taurus:~/tmp/8d92fb292706$ wgit init > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/ > > uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir > uwe@taurus:~/tmp/8d92fb292706$ cd subdir/ > uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/ > uwe@taurus:~/tmp/8d92fb292706/subdir$ cd .. > > uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile > uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile > uwe@taurus:~/tmp/8d92fb292706$ wgit status > On branch master > > No commits yet > > Changes to be committed: > (use "git rm --cached <file>..." to unstage) > new file: subdir/somefile Eek, that's bad. I think there's a number of dragons going down that route. > with 8d92fb292706, the following happens: > > uwe@taurus:~/tmp/8d92fb292706$ wgit version > git version 2.25.2.8.g8d92fb292706 > uwe@taurus:~/tmp/8d92fb292706$ wgit init > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/ > uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir > uwe@taurus:~/tmp/8d92fb292706$ cd subdir/ > uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/ > uwe@taurus:~/tmp/8d92fb292706/subdir$ cd .. > uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile > uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile Not optimal; more on this below. > uwe@taurus:~/tmp/8d92fb292706$ wgit status > On branch master > > No commits yet > > Untracked files: > (use "git add <file>..." to include in what will be committed) > subdir/ > > nothing added to commit but untracked files present (use "git add" to track) This part looks good to me. > So git after 8d92fb292706 doesn't add files from a subdirectory if > said subdirectory is tracked in git, too. > > While I'm not sure which of the two behaviours is the bogus one, this is > a change in behaviour that I guess wasn't intended in 8d92fb292706. I put some effort separate from that commit into avoiding accidentally recursing into nested git dirs; see e.g. commit 09487f2cba ("clean: avoid removing untracked files in a nested git repository", 2019-09-17). So, I was slightly surprised that some other commit hadn't fixed this. However, it's not all that surprising to me that 8d92fb292706 affected this. Prior to that commit, we visited untracked paths which were n directories deep a ridiculous 2^n times. But what made it even more fun was that the status returned for any given path (tracked, ignored, not interesting to the traversal, etc.) was not always the same; later traversals might return something different than earlier traversals. That confusion made it real "fun" trying to ensure no regressions when reducing the number of visits to any given path from 2^n down to 1. The fact that side effects of the traversals (the population of the dir.entries and dir.ignored) could have essentially functioned to override a later traversal's return status certainly didn't help; it was such a mess. But, interestingly, the fixed behavior here also depends pretty strongly on commit b9670c1f5e ("dir: fix checks on common prefix directory", 2019-12-19) which came months earlier. This is particularly important in combination with the following comment from dir.h: /** * If set, recurse into a directory that looks like a Git directory. * Otherwise it is shown as a directory. */ DIR_NO_GITLINKS = 1<<3, which suggests that cmd_add() (which didn't set this flag) should have never been recursing into a Git-tracked directory. In other words, this was a bug all along. > Is this something that needs fixing? I agree with brian elsewhere in this thread that not adding the file is correct. However, two points: * I would prefer to see a warning/error from git add when it doesn't add a path (Any takers? #leftoverbits maybe?) * It is possible that one might want to be able to force the addition of files to an outer repository despite existing within a directory tracked by an inner git repository, perhaps with a double `--force` being passed to git-add (much like git-clean allows). If so, that could be implemented via the addition of dir.flags |= DIR_NO_GITLINKS; to cmd_add() when the double force is detected. Hope that helps, Elijah ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") 2021-12-07 5:29 ` Elijah Newren @ 2021-12-07 7:14 ` Uwe Kleine-König 0 siblings, 0 replies; 4+ messages in thread From: Uwe Kleine-König @ 2021-12-07 7:14 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List, Björn Lässig [-- Attachment #1: Type: text/plain, Size: 6222 bytes --] Hello, first of all thanks for addressing my report, also to brian. On Mon, Dec 06, 2021 at 09:29:59PM -0800, Elijah Newren wrote: > On Mon, Dec 6, 2021 at 4:16 AM Uwe Kleine-König > <u.kleine-koenig@pengutronix.de> wrote: > > I admit this is somewhat of a corner case, still it happens in the > > reality of our admin team ... > > Initially this was noticed after upgrading the OS from Debian buster > > (with git 2.20.1) to Debian bullseye (with git 2.30.2). > > > > (wgit is just a wrapper for git to call it from my ~/src/git.) > > > > This is the good ("old") case: > > > > uwe@taurus:~/tmp/8d92fb29270$ wgit version > > git version 2.25.2.7.g0bbd0e8b5233 > > > > uwe@taurus:~/tmp/8d92fb292706$ wgit init > > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/ > > > > uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir > > uwe@taurus:~/tmp/8d92fb292706$ cd subdir/ > > uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init > > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/ > > uwe@taurus:~/tmp/8d92fb292706/subdir$ cd .. > > > > uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile > > uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile > > uwe@taurus:~/tmp/8d92fb292706$ wgit status > > On branch master > > > > No commits yet > > > > Changes to be committed: > > (use "git rm --cached <file>..." to unstage) > > new file: subdir/somefile > > Eek, that's bad. I think there's a number of dragons going down that route. Yes, as soon as you start to checkout files in one of the involved repositories you implicitly change the other one. In this case git is only used to track files in /etc and elsewhere, and there are two mechanisms to track them. (Don't ask for the reasons, I don't know them. :-) In this setup (I think) the dragons should be well fixed to their chains. > > with 8d92fb292706, the following happens: > > > > uwe@taurus:~/tmp/8d92fb292706$ wgit version > > git version 2.25.2.8.g8d92fb292706 > > uwe@taurus:~/tmp/8d92fb292706$ wgit init > > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/.git/ > > uwe@taurus:~/tmp/8d92fb292706$ mkdir subdir > > uwe@taurus:~/tmp/8d92fb292706$ cd subdir/ > > uwe@taurus:~/tmp/8d92fb292706/subdir$ wgit init > > Initialized empty Git repository in /home/uwe/tmp/8d92fb292706/subdir/.git/ > > uwe@taurus:~/tmp/8d92fb292706/subdir$ cd .. > > uwe@taurus:~/tmp/8d92fb292706$ echo content > subdir/somefile > > uwe@taurus:~/tmp/8d92fb292706$ wgit add subdir/somefile > > Not optimal; more on this below. > > > uwe@taurus:~/tmp/8d92fb292706$ wgit status > > On branch master > > > > No commits yet > > > > Untracked files: > > (use "git add <file>..." to include in what will be committed) > > subdir/ > > > > nothing added to commit but untracked files present (use "git add" to track) > > This part looks good to me. > > > So git after 8d92fb292706 doesn't add files from a subdirectory if > > said subdirectory is tracked in git, too. > > > > While I'm not sure which of the two behaviours is the bogus one, this is > > a change in behaviour that I guess wasn't intended in 8d92fb292706. > > I put some effort separate from that commit into avoiding accidentally > recursing into nested git dirs; see e.g. commit 09487f2cba ("clean: > avoid removing untracked files in a nested git repository", > 2019-09-17). So, I was slightly surprised that some other commit > hadn't fixed this. > > However, it's not all that surprising to me that 8d92fb292706 affected > this. Prior to that commit, we visited untracked paths which were n > directories deep a ridiculous 2^n times. But what made it even more > fun was that the status returned for any given path (tracked, ignored, > not interesting to the traversal, etc.) was not always the same; later > traversals might return something different than earlier traversals. > That confusion made it real "fun" trying to ensure no regressions when > reducing the number of visits to any given path from 2^n down to 1. > The fact that side effects of the traversals (the population of the > dir.entries and dir.ignored) could have essentially functioned to > override a later traversal's return status certainly didn't help; it > was such a mess. > > But, interestingly, the fixed behavior here also depends pretty > strongly on commit b9670c1f5e ("dir: fix checks on common prefix > directory", 2019-12-19) which came months earlier. This is > particularly important in combination with the following comment from > dir.h: > /** > * If set, recurse into a directory that looks like a Git directory. > * Otherwise it is shown as a directory. > */ > DIR_NO_GITLINKS = 1<<3, > which suggests that cmd_add() (which didn't set this flag) should have > never been recursing into a Git-tracked directory. In other words, > this was a bug all along. > > > Is this something that needs fixing? > > I agree with brian elsewhere in this thread that not adding the file > is correct. However, two points: > > * I would prefer to see a warning/error from git add when it doesn't > add a path (Any takers? #leftoverbits maybe?) Agreed, also maybe return an exit code != 0? > * It is possible that one might want to be able to force the addition > of files to an outer repository despite existing within a directory > tracked by an inner git repository, perhaps with a double `--force` > being passed to git-add (much like git-clean allows). If so, that > could be implemented via the addition of > dir.flags |= DIR_NO_GITLINKS; > to cmd_add() when the double force is detected. I think this would be good. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | https://www.pengutronix.de/ | [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-12-07 7:14 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-12-06 12:16 Regression in 8d92fb292706 ("dir: replace exponential algorithm with a linear one") Uwe Kleine-König 2021-12-07 2:19 ` brian m. carlson 2021-12-07 5:29 ` Elijah Newren 2021-12-07 7:14 ` Uwe Kleine-König
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).