* possible bug in git apply?
@ 2007-08-04 19:45 david
2007-08-04 20:00 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: david @ 2007-08-04 19:45 UTC (permalink / raw)
To: git; +Cc: rob
since git doesn't track directories, only content (per the big discussion
recently) I beleive that doing a checkout would leave Rob without the
directories that he emptied out, so shouldn't git apply also clear the
directories to end up in the same state?
David Lang
---------- Forwarded message ----------
Date: Sat, 4 Aug 2007 15:19:54 -0400
From: Rob Landley <rob@landley.net>
To: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roland Dreier <rdreier@cisco.com>,
Linux Kernel <linux-kernel@vger.kernel.org>, linux-doc@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>, Paul Mundt <lethal@linux-sh.org>
Subject: Re: [PATCH 1/2] Group architecture Documentation under
Documentation/arch.
On Saturday 04 August 2007 2:03:59 pm Rob Landley wrote:
> Signed-off-by: Rob Landley <rob@landley.net>
> Amiga part Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
>
> Move architecture-specific Documentation into a common subdirectory.
I really, really, really hate git.
Ok, on my laptop I just noticed that "git apply" of the patch didn't complain
but it also left the empty subdirectories it moved stuff out of. (I don't
believe this happened on the version of git I was using on my previous
laptop, which ate itself a month and change ago, but obviously I can't
check.)
There is no "git rmdir". "git rm" refuses to delete the directory
without -r. "git rm -r Documentation/x86_64" listed (as just deleted) all
the files that the patch already moved out of the directory.
Am I missing something obvious here?
Rob
--
"One of my most productive days was throwing away 1000 lines of code."
- Ken Thompson.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: possible bug in git apply? 2007-08-04 19:45 possible bug in git apply? david @ 2007-08-04 20:00 ` Junio C Hamano 2007-08-04 20:08 ` David Kastrup 2007-08-05 4:48 ` Linus Torvalds 2 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2007-08-04 20:00 UTC (permalink / raw) To: david; +Cc: git, rob david@lang.hm writes: > since git doesn't track directories, only content (per the big > discussion recently) I beleive that doing a checkout would leave Rob > without the directories that he emptied out, so shouldn't git apply > also clear the directories to end up in the same state? It should and it does, unless you have untracked files in that directory that you haven't told git about. Then removing the directory along with these files will lose the files there, which is a wrong thing to do. A simple "rm -fr this/directory/is/no/longer/interesting/to/me" would be all that is needed. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-04 19:45 possible bug in git apply? david 2007-08-04 20:00 ` Junio C Hamano @ 2007-08-04 20:08 ` David Kastrup 2007-08-05 4:48 ` Linus Torvalds 2 siblings, 0 replies; 15+ messages in thread From: David Kastrup @ 2007-08-04 20:08 UTC (permalink / raw) To: david; +Cc: git, rob david@lang.hm writes: > On Saturday 04 August 2007 2:03:59 pm Rob Landley wrote: >> Signed-off-by: Rob Landley <rob@landley.net> >> Amiga part Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> >> >> Move architecture-specific Documentation into a common subdirectory. > > I really, really, really hate git. > > Ok, on my laptop I just noticed that "git apply" of the patch didn't > complain but it also left the empty subdirectories it moved stuff > out of. (I don't believe this happened on the version of git I was > using on my previous laptop, which ate itself a month and change > ago, but obviously I can't check.) > > There is no "git rmdir". "git rm" refuses to delete the directory > without -r. "git rm -r Documentation/x86_64" listed (as just deleted) all > the files that the patch already moved out of the directory. > > Am I missing something obvious here? Committing the change? > since git doesn't track directories, only content (per the big > discussion recently) I beleive that doing a checkout would leave Rob > without the directories that he emptied out, so shouldn't git apply > also clear the directories to end up in the same state? Yes, once he commits. As long as git keeps files tracked in that directory, there is no reason for it to delete it. I agree that it is hard to come up with a good logic for this sort of thing. git-add checks the _current_ state of a file into the index. git-rm can actually do the same only by actually _deleting_ the working copy. So when should git try deleting the directory? Probably when the directory becomes empty in the index, for consistency. Too bad that the index does not contain any information about directories at all, so there is no good way to figure this particular point in time out efficiently. I guess that git rather attempts deleting the directory when the tree in the _repository_ rather than the index becomes empty. And for that you need to commit. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-04 19:45 possible bug in git apply? david 2007-08-04 20:00 ` Junio C Hamano 2007-08-04 20:08 ` David Kastrup @ 2007-08-05 4:48 ` Linus Torvalds 2007-08-05 4:53 ` Linus Torvalds 2007-08-05 5:11 ` Junio C Hamano 2 siblings, 2 replies; 15+ messages in thread From: Linus Torvalds @ 2007-08-05 4:48 UTC (permalink / raw) To: david; +Cc: git, rob On Sat, 4 Aug 2007, david@lang.hm wrote: > > since git doesn't track directories, only content (per the big discussion > recently) I beleive that doing a checkout would leave Rob without the > directories that he emptied out, so shouldn't git apply also clear the > directories to end up in the same state? It should. I thought we did that, but maybe there's some bug there. See "remove_file()" in builtin-apply.c. But yeah, it seems that the file *rename* ends up not triggering that logic! Very annoying. Does this fix it? Totally untested, but it _looks_ obvious enough.. Linus --- diff --git a/builtin-apply.c b/builtin-apply.c index 0a0b4a9..da27075 100644 --- a/builtin-apply.c +++ b/builtin-apply.c @@ -2508,7 +2508,7 @@ static void write_out_one_result(struct patch *patch, int phase) * thing: remove the old, write the new */ if (phase == 0) - remove_file(patch, 0); + remove_file(patch, patch->is_rename); if (phase == 1) create_file(patch); } ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 4:48 ` Linus Torvalds @ 2007-08-05 4:53 ` Linus Torvalds 2007-08-05 5:11 ` Junio C Hamano 1 sibling, 0 replies; 15+ messages in thread From: Linus Torvalds @ 2007-08-05 4:53 UTC (permalink / raw) To: david; +Cc: git, rob On Sat, 4 Aug 2007, Linus Torvalds wrote: > > But yeah, it seems that the file *rename* ends up not triggering that > logic! Very annoying. > > Does this fix it? Totally untested, but it _looks_ obvious enough.. Side note: git will never remove a directory even if it becomes empty as far as *git* is concerned, if there are other files that git doesn't know about in there. So if you really really want to remove that directory, you do end up having to often just doing rm -rf some/dir/ and the fact that you don't find a "git rmdir" is that with git, you really should not even need it. We do "git mv" and friends, but the fact is, that may have confused people into thinking that git cares. The "default mindset" should be to just do everything directly in the filesystem, and git will just figure things out on its own. The "git mv" and "git rm" stuff is purely *convenience* features, nothing more. Other SCM's really want you to do "scm rm" or "scm cp" or whatever. Git really really really doesn't care, and I think people are a bit too afraid of just doing the operation the normal UNIX way. So next time you want to remove a directory, just remove it. With bog-standard unix tools. Use "rm". Or a graphical file manager. Or muck with the disk directly using a binary editor. Git won't care. It will notice that the file is gone, and do the right thing. Linus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 4:48 ` Linus Torvalds 2007-08-05 4:53 ` Linus Torvalds @ 2007-08-05 5:11 ` Junio C Hamano 2007-08-05 7:55 ` David Kastrup 2007-08-05 16:59 ` Linus Torvalds 1 sibling, 2 replies; 15+ messages in thread From: Junio C Hamano @ 2007-08-05 5:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: david, git, rob Linus Torvalds <torvalds@linux-foundation.org> writes: > It should. I thought we did that, but maybe there's some bug there. > > See "remove_file()" in builtin-apply.c. > > But yeah, it seems that the file *rename* ends up not triggering that > logic! Very annoying. > > Does this fix it? Totally untested, but it _looks_ obvious enough.. That would regress the fix made in aea19457, I am afraid. If you are in a subdirectory and the rename patch moves away the last file in your current directory, the shell session you ran the git-apply from will end up in an unlinked directory. Maybe that is a pilot error, and we can revert aea19457 altogether? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 5:11 ` Junio C Hamano @ 2007-08-05 7:55 ` David Kastrup 2007-08-05 16:59 ` Linus Torvalds 1 sibling, 0 replies; 15+ messages in thread From: David Kastrup @ 2007-08-05 7:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, david, git, rob Junio C Hamano <gitster@pobox.com> writes: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> It should. I thought we did that, but maybe there's some bug there. >> >> See "remove_file()" in builtin-apply.c. >> >> But yeah, it seems that the file *rename* ends up not triggering that >> logic! Very annoying. >> >> Does this fix it? Totally untested, but it _looks_ obvious enough.. > > That would regress the fix made in aea19457, I am afraid. If > you are in a subdirectory and the rename patch moves away the > last file in your current directory, the shell session you ran > the git-apply from will end up in an unlinked directory. > > Maybe that is a pilot error, and we can revert aea19457 > altogether? I'd call it pilot error. We can't really have git's behavior depend on where somebody might point his cwd. If we do that, we may never remove any directory. Once git tracks directories, it will be harder to make this particular pilot error (git rm ../mycwd is more obviously having this effect than git rm *), but a pilot error it is in my book. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 5:11 ` Junio C Hamano 2007-08-05 7:55 ` David Kastrup @ 2007-08-05 16:59 ` Linus Torvalds 2007-08-05 17:53 ` David Kastrup 2007-08-06 8:29 ` Junio C Hamano 1 sibling, 2 replies; 15+ messages in thread From: Linus Torvalds @ 2007-08-05 16:59 UTC (permalink / raw) To: Junio C Hamano; +Cc: david, git, rob On Sat, 4 Aug 2007, Junio C Hamano wrote: > > > > Does this fix it? Totally untested, but it _looks_ obvious enough.. > > That would regress the fix made in aea19457, I am afraid. The fix in aea19457 is broken anyway. Why? That whole "do it in two phases" thing breaks it. What can happen is that you have a directory with 100 files, and: - a patch modifies 99 of them - and removes one What happens is that during phase 0, we'll remove all the files (and *not* write new ones), and then beause the last patch entry is a removal, we'll also remove the directory (which succeeds, because all the files that got modified are all long gone). Then, in phase 1, we'll re-create the files that we modified, and create a whole new directory. IOW, as far as I can see we _already_ delete and then recreate the directory structure under some circumstances. I just extended it to also do it for "rename" and not just delete, since a rename may be renaming it to another directory. So I'd say that my patch is a clear improvement on the current situation. That said, if we really wanted to get it right, we should do this as a three-phase thing: (1) remove old files (2) create new files (3) for all removals and renames, try to remove source directories that might have become empty. That would fix it properly and for all cases. Linus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 16:59 ` Linus Torvalds @ 2007-08-05 17:53 ` David Kastrup 2007-08-05 18:18 ` Linus Torvalds 2007-08-06 8:29 ` Junio C Hamano 1 sibling, 1 reply; 15+ messages in thread From: David Kastrup @ 2007-08-05 17:53 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, david, git, rob Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sat, 4 Aug 2007, Junio C Hamano wrote: >> > >> > Does this fix it? Totally untested, but it _looks_ obvious enough.. >> >> That would regress the fix made in aea19457, I am afraid. > > The fix in aea19457 is broken anyway. > > Why? > > That whole "do it in two phases" thing breaks it. Did not know that. > What can happen is that you have a directory with 100 files, and: > - a patch modifies 99 of them > - and removes one > > What happens is that during phase 0, we'll remove all the files (and *not* > write new ones), and then beause the last patch entry is a removal, we'll > also remove the directory (which succeeds, because all the files that got > modified are all long gone). > > Then, in phase 1, we'll re-create the files that we modified, and create a > whole new directory. > > IOW, as far as I can see we _already_ delete and then recreate the > directory structure under some circumstances. Which will let the user sit in an empty directory void of . and .., and a parallel directory with the old name elsewhere. Unpretty... > I just extended it to also do it for "rename" and not just delete, > since a rename may be renaming it to another directory. > > So I'd say that my patch is a clear improvement on the current > situation. > > That said, if we really wanted to get it right, we should do this as > a three-phase thing: (1) remove old files (2) create new files (3) > for all removals and renames, try to remove source directories that > might have become empty. > > That would fix it properly and for all cases. Stupid question from someone without good background: why do we need two passes in the first place? Can't we just do phase 1/2/3 for every file in one step? Is there any case where not having done (1) for a _different_ file is going to cause trouble for (2)? I presume this is intended for something like create a (plain file, coming in sort order before a/) delete a/x delete a/y (last file in a/) in the index and frankly, this will cease working in your three-phase scheme. The problem is that we really need a -depth sort order for deletion in the index, meaning that delete xxx/yyy sorts before create xxx* When we don't change the sorting order, one can do so with recursion: when finding create a we postpone processing it until a prospective delete a/* is over. That means first processing a.txt, a.whatever/ and so on. I think that is not sane. As far as I can see, changing the index sort order for deletions is probably the sanest _working_ way to go about this. One problem is that corresponding "delete" and "create" items are then no longer necessarily adjacent in the index. The sensible way to go about this is to sort the requests into _two_ linked lists, one in "create" order, one in "delete" order, and when merging a "create" request in the index, one compares with the head of the "create" ordered list, and when merging a "delete" request in the index, one compares with the head of the "delete" ordered list. Is this pretty? Not at all. But I don't see that any _fixed_ number of phases will buy us out of the principal problem, namely that deletions have to be done depth-first, and deletions of a directory have to be finished before we can attempt creating a new file in their place. Yes, this implies changing the index sort order, unfortunately. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 17:53 ` David Kastrup @ 2007-08-05 18:18 ` Linus Torvalds 2007-08-05 18:50 ` David Kastrup 0 siblings, 1 reply; 15+ messages in thread From: Linus Torvalds @ 2007-08-05 18:18 UTC (permalink / raw) To: David Kastrup; +Cc: Junio C Hamano, david, git, rob On Sun, 5 Aug 2007, David Kastrup wrote: > > Which will let the user sit in an empty directory void of . and .., > and a parallel directory with the old name elsewhere. Unpretty... It actually depends on the OS. POSIX allows the case of "rmdir()" returning EBUSY if the directory is "in use", where "in use" may mean being somebodys current working directory. But yes, most UNIXes (Linux very much included) will just remove the directory, and yes, the user ends up sitting in a directory that is empty and not reachable from anywhere else. Easy enough to see: [torvalds@woody ~]$ mkdir test-me ; cd test-me ; rmdir $(pwd) [torvalds@woody test-me]$ ls -a [torvalds@woody test-me]$ pwd /home/torvalds/test-me [torvalds@woody test-me]$ /bin/pwd /bin/pwd: couldn't find directory entry in `..' with matching i-node (the first "pwd" is a shell built-in, and the shell caches is). Under Linux, you can also see funky things like this: [torvalds@woody test-me]$ ls -l /proc/self/cwd lrwxrwxrwx 1 torvalds torvalds 0 2007-08-05 11:09 /proc/self/cwd -> /home/torvalds/test-me (deleted) ie khe kernel actually shows you that you're in a deleted directory, that _used_ to be called "test-me". > > That said, if we really wanted to get it right, we should do this as > > a three-phase thing: (1) remove old files (2) create new files (3) > > for all removals and renames, try to remove source directories that > > might have become empty. > > > > That would fix it properly and for all cases. > > Stupid question from someone without good background: why do we need > two passes in the first place? For example, a patch that removes a directory structure "x/..." and then creates a file "x" in its place. In order for the patch ordering to not matter, you want to do the "remove old state" in an earlier phase. And patch order shouldn't matter, since you can have interesting things like mixing of renames and creates etc (ie maybe it's not "removing" that directory x, it's just moving all the contents somewhere else, and maybe the new file "x" is a move from somewhere else). Criss-crossing renames make it even more interesting. So git-apply actually does things in more than two phases: there's a whole another phase that is the "read the patch and create the in-memory result". The "two phases" above are actually just the two phases concerned with actually modifying the working tree, there's more phases elsewhere. Linus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 18:18 ` Linus Torvalds @ 2007-08-05 18:50 ` David Kastrup 2007-08-05 19:20 ` Linus Torvalds 0 siblings, 1 reply; 15+ messages in thread From: David Kastrup @ 2007-08-05 18:50 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, david, git, rob Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sun, 5 Aug 2007, David Kastrup wrote: [...] >> > That said, if we really wanted to get it right, we should do this as >> > a three-phase thing: (1) remove old files (2) create new files (3) >> > for all removals and renames, try to remove source directories that >> > might have become empty. >> > >> > That would fix it properly and for all cases. >> >> Stupid question from someone without good background: why do we need >> two passes in the first place? > > For example, a patch that removes a directory structure "x/..." and then > creates a file "x" in its place. > > In order for the patch ordering to not matter, you want to do the > "remove old state" in an earlier phase. But your proposed three passes won't work with a patch removing "x/..." and creating "x" in its place, since "x/" gets only removed in pass 3, and "x" needs to created in pass 2 already. If you had bothered reading my mail to the end: I explained exactly that. So your three pass scheme actually breaks the case for which the two-pass scheme has been designed. I propose you read my previous mail to its end: I explain a scheme that will work in this case, but it would, as far as I understand index processing, necessitate changing the index sort order, basically having -depth order for deletion entries. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 18:50 ` David Kastrup @ 2007-08-05 19:20 ` Linus Torvalds 2007-08-05 19:37 ` David Kastrup 0 siblings, 1 reply; 15+ messages in thread From: Linus Torvalds @ 2007-08-05 19:20 UTC (permalink / raw) To: David Kastrup; +Cc: Junio C Hamano, david, git, rob On Sun, 5 Aug 2007, David Kastrup wrote: > > But your proposed three passes won't work with a patch removing > "x/..." and creating "x" in its place, since "x/" gets only removed > in pass 3, and "x" needs to created in pass 2 already. Yes, I was wrong. The current two passes are the right thing to do, and we should just always remove empty directories (and my patch was fine: we can leave them alone if it's a pure "modify file in place", but that's really the only case). Linus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 19:20 ` Linus Torvalds @ 2007-08-05 19:37 ` David Kastrup 0 siblings, 0 replies; 15+ messages in thread From: David Kastrup @ 2007-08-05 19:37 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, david, git, rob Linus Torvalds <torvalds@linux-foundation.org> writes: > On Sun, 5 Aug 2007, David Kastrup wrote: >> >> But your proposed three passes won't work with a patch removing >> "x/..." and creating "x" in its place, since "x/" gets only removed >> in pass 3, and "x" needs to created in pass 2 already. > > Yes, I was wrong. The current two passes are the right thing to do, and we > should just always remove empty directories (and my patch was fine: we can > leave them alone if it's a pure "modify file in place", but that's really > the only case). The consequence will be that renaming all files in one directory (and "all" can even be a single file) will temporarily delete and recreate that directory. My proposed change of index sort and processing order would take care of that without requiring multiple passes, at the cost of changing the index format and processing. I think that it would be a sound long-term solution. Anyway, once directories can be tracked (again necessitating a change of index format), surprising directory deletion and recreation should become less of an issue, but it won't help with projects that continue not tracking directories (presumably most patch-based workflows). -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-05 16:59 ` Linus Torvalds 2007-08-05 17:53 ` David Kastrup @ 2007-08-06 8:29 ` Junio C Hamano 2007-08-06 9:37 ` Junio C Hamano 1 sibling, 1 reply; 15+ messages in thread From: Junio C Hamano @ 2007-08-06 8:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: david, git, rob Linus Torvalds <torvalds@linux-foundation.org> writes: > That said, if we really wanted to get it right, we should do this as a > three-phase thing: (1) remove old files (2) create new files (3) for all > removals and renames, try to remove source directories that might have > become empty. > > That would fix it properly and for all cases. Actually that will break the case of removing foo/bar, which is the only file in directory foo, and creating a new file foo. So if we really want to do all the corner cases, we would need to do something like: * scan the list of files that will remain (i.e., renamed-to, modifed-in-place and created-anew) to note which directories should remain in the result; * remove old files, and remove its empty parent directories that are not in the above list; * create new files. But in the meantime for 1.5.3, I think your patch is better than what we currently have. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: possible bug in git apply? 2007-08-06 8:29 ` Junio C Hamano @ 2007-08-06 9:37 ` Junio C Hamano 0 siblings, 0 replies; 15+ messages in thread From: Junio C Hamano @ 2007-08-06 9:37 UTC (permalink / raw) To: Linus Torvalds; +Cc: david, git, rob Junio C Hamano <gitster@pobox.com> writes: > Actually that will break the case of removing foo/bar, which is > the only file in directory foo, and creating a new file foo. Sorry. The creation codepath already has enough code to remove the left-over directory foo/ when creating the file foo. I think your three-step process should be Ok. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2007-08-06 9:37 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-08-04 19:45 possible bug in git apply? david 2007-08-04 20:00 ` Junio C Hamano 2007-08-04 20:08 ` David Kastrup 2007-08-05 4:48 ` Linus Torvalds 2007-08-05 4:53 ` Linus Torvalds 2007-08-05 5:11 ` Junio C Hamano 2007-08-05 7:55 ` David Kastrup 2007-08-05 16:59 ` Linus Torvalds 2007-08-05 17:53 ` David Kastrup 2007-08-05 18:18 ` Linus Torvalds 2007-08-05 18:50 ` David Kastrup 2007-08-05 19:20 ` Linus Torvalds 2007-08-05 19:37 ` David Kastrup 2007-08-06 8:29 ` Junio C Hamano 2007-08-06 9:37 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).