git-rm isn't the inverse action of git-add

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* git-rm isn't the inverse action of git-add
@ 2007-07-02 18:09 Christian Jaeger
  2007-07-02 19:42 ` Yann Dirson
  0 siblings, 1 reply; 37+ messages in thread
From: Christian Jaeger @ 2007-07-02 18:09 UTC (permalink / raw)
  To: git

Hello

I'm coming from cogito. There you can run:

  cg-add $file ; cg-rm $file

and everything is as before; it adds the file to the directory
index/cache, and just removes it again from the latter.

Whereas with git,

  git-add $file; git-rm $file

is giving the error

  error: '..file..' has changes staged in the index (hint: try -f)

And sure enough, git rm -f $file will remove the file from the index,
but also unlink it from the directory. (Ok, I did remember that cogito's
-f option is unlinking the file, so I was cautious and didn't try it on
an important file, but still...)

Turns out that

  git rm  -f --cached $file

will do the same action as cg-rm $file.

Why so complicated? Why not just make git-rm without options behave like
cg-rm? (Or at the very least, I'd change the hint to say "try -f --cached".)

Christian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 18:09 git-rm isn't the inverse action of git-add Christian Jaeger
@ 2007-07-02 19:42 ` Yann Dirson
  2007-07-02 20:23   ` Christian Jaeger
  0 siblings, 1 reply; 37+ messages in thread
From: Yann Dirson @ 2007-07-02 19:42 UTC (permalink / raw)
  To: Christian Jaeger; +Cc: git

On Mon, Jul 02, 2007 at 08:09:37PM +0200, Christian Jaeger wrote:
> Why so complicated? Why not just make git-rm without options behave like
> cg-rm? (Or at the very least, I'd change the hint to say "try -f --cached".)

It is probably a matter of taste.  Personally, I am really upset by
this behaviour that cvs, cogito, stgit and others share, which forces
me to issue 2 commands to really delete a file from version control
and from the filesystem.

Do you really need to undo an add more often than you need to remove a
file from version-control ?  It may be worth, however, to make things
easier.  Maybe "git add --undo foo" would be a solution ?  Not sure
we'd want to add --undo to many git commands, though.  Opinions ?

Best regards,
-- 
Yann

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 19:42 ` Yann Dirson
@ 2007-07-02 20:23   ` Christian Jaeger
  2007-07-02 20:40     ` Yann Dirson
  2007-07-11 12:20     ` Jakub Narebski
  0 siblings, 2 replies; 37+ messages in thread
From: Christian Jaeger @ 2007-07-02 20:23 UTC (permalink / raw)
  To: Yann Dirson; +Cc: git

Yann Dirson wrote:
> On Mon, Jul 02, 2007 at 08:09:37PM +0200, Christian Jaeger wrote:
>   
>> Why so complicated? Why not just make git-rm without options behave like
>> cg-rm? (Or at the very least, I'd change the hint to say "try -f --cached".)
>>     
>
> It is probably a matter of taste.  Personally, I am really upset by
> this behaviour that cvs, cogito, stgit and others share, which forces
> me to issue 2 commands to really delete a file from version control
> and from the filesystem.
>   

It doesn't force you to issue 2 commands: the -f option to cg-rm unlinks
the file for you. So you only have to type an additional "-f".

Yes it's probably (partly) a matter of taste: in my bash startup files I
have mv aliased to mv -i and rm to rm -i, so that it asks me whether I'm
sure to delete or overwrite a file. If I know in advance that I'm sure,
I just type "rm -f $file", which then expands to "rm -i -f $file" where
the -f overrides the -i. cg-rm -f just fits very well into this scheme
(the only difference being that "cg-rm $file" doesn't explicitely ask me
whether I also want the file to be unlinked). (BTW note that usually for
removing a file I use a "trash" (or shorter alias "tra") command, which
moves it to a trash can instead of deleting; so I use "tra $file" by
default, and only for big files or when I'm sure I immediately want to
delete them, I use rm, and then if the paths are clear I add the -f
flag, if not (like globbing involved), I don't add the -f and thus am
asked for confirmation.)

If I could alias the git-rm command so that the default action is the
reverse of git-add and adding an -f flag removes it from disk, that
would be fine for me.

> Do you really need to undo an add more often than you need to remove a
> file from version-control ?  It may be worth, however, to make things
> easier.  Maybe "git add --undo foo" would be a solution ?

This doesn't sound very intuitive to me (and I couldn't fix it with an
alias).

I don't per se require undo actions. I just don't understand why git-rm
refuses to remove the file from the index, even if I didn't commit it.
The index is just an intermediate record of the changes in my
understandings, and the rm action would also be intermediate until it's
being committed. And a non-committed action being deleted shouldn't need
a special confirmation from me, especially not one which is consisting
of a combination of two flags (of which one is a destructive one).

Christian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 20:23   ` Christian Jaeger
@ 2007-07-02 20:40     ` Yann Dirson
  2007-07-02 20:54       ` Matthieu Moy
  2007-07-02 21:20       ` git-rm isn't the inverse action of git-add Christian Jaeger
  2007-07-11 12:20     ` Jakub Narebski
  1 sibling, 2 replies; 37+ messages in thread
From: Yann Dirson @ 2007-07-02 20:40 UTC (permalink / raw)
  To: Christian Jaeger; +Cc: git

On Mon, Jul 02, 2007 at 10:23:00PM +0200, Christian Jaeger wrote:
> I don't per se require undo actions. I just don't understand why git-rm
> refuses to remove the file from the index, even if I didn't commit it.

I'd say it does so, so you won't loose any uncommitted changes without
knowing it - and "git add -f" is available when you have checked that
you indeed want to discard that data.

> The index is just an intermediate record of the changes in my
> understandings, and the rm action would also be intermediate until it's
> being committed. And a non-committed action being deleted shouldn't need
> a special confirmation from me, especially not one which is consisting
> of a combination of two flags (of which one is a destructive one).

It already works as such: it will warn you if you have already staged
the file in the index, but it has not been committed, in which case
the data would be lost as well:

$ echo foo > bar
/tmp/test$ git rm bar
fatal: pathspec 'bar' did not match any files
/tmp/test$ git add bar
/tmp/test$ git rm bar
error: 'bar' has changes staged in the index (hint: try -f)

That is, "git rm" will only ever remove the file without asking, when
it is safe do so, in that you can retrieve your file from history.  Or
do you think of another way, in which more safety would be needed ?

Best regards,
-- 
Yann

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 20:40     ` Yann Dirson
@ 2007-07-02 20:54       ` Matthieu Moy
  2007-07-02 21:05         ` Johannes Schindelin
  2007-07-02 21:20       ` git-rm isn't the inverse action of git-add Christian Jaeger
  1 sibling, 1 reply; 37+ messages in thread
From: Matthieu Moy @ 2007-07-02 20:54 UTC (permalink / raw)
  To: Yann Dirson; +Cc: Christian Jaeger, git

Yann Dirson <ydirson@altern.org> writes:

> That is, "git rm" will only ever remove the file without asking, when
> it is safe do so, in that you can retrieve your file from history.  Or
> do you think of another way, in which more safety would be needed ?

Defaulting to --cached would be an obvious way to avoid data-loss.
_At least_, mentionning --cached in the error message in case of
staged changes would be a considerable step forward.

At the moment, the non-expert user will have difficulties to unversion
the file without deleting it. I just see it as

$ git rm foo
error: 'foo' has changes staged in the index
(hint: to hang yourself, try -f)
$ _

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 20:54       ` Matthieu Moy
@ 2007-07-02 21:05         ` Johannes Schindelin
  2007-07-03 10:37           ` Matthieu Moy
  0 siblings, 1 reply; 37+ messages in thread
From: Johannes Schindelin @ 2007-07-02 21:05 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Yann Dirson, Christian Jaeger, git

Hi,

On Mon, 2 Jul 2007, Matthieu Moy wrote:

> Defaulting to --cached would be an obvious way to avoid data-loss. _At 
> least_, mentionning --cached in the error message in case of staged 
> changes would be a considerable step forward.
> 
> At the moment, the non-expert user will have difficulties to unversion 
> the file without deleting it. I just see it as
> 
> $ git rm foo
> error: 'foo' has changes staged in the index
> (hint: to hang yourself, try -f)
> $ _

What's so wrong with our man pages? You know, there have been man hours 
invested in them, and they are exclusively meant for consumption by people 
who do not know about the usage of the commands...

Hth,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 20:40     ` Yann Dirson
  2007-07-02 20:54       ` Matthieu Moy
@ 2007-07-02 21:20       ` Christian Jaeger
  2007-07-03  4:12         ` Jeff King
  1 sibling, 1 reply; 37+ messages in thread
From: Christian Jaeger @ 2007-07-02 21:20 UTC (permalink / raw)
  To: Yann Dirson; +Cc: git

Yann Dirson wrote:
> On Mon, Jul 02, 2007 at 10:23:00PM +0200, Christian Jaeger wrote:
>   
>> I don't per se require undo actions. I just don't understand why git-rm
>> refuses to remove the file from the index, even if I didn't commit it.
>>     
>
> I'd say it does so, so you won't loose any uncommitted changes without
> knowing it - and "git add -f" is available when you have checked that
> you indeed want to discard that data.
>   

I'm really realising that

git-rm $file # where $file *has* been committed previously

does remove *and* unlink the file. (cg-rm does unlink only with the -f
flag, as said.)

So there's no -f flag in normal git-rm usage. It thus has a different
meaning, namely "force the operation pair of removing from index and
unlinking", not "force this operation also onto the checked out files"
as is the case with cogito.

So I now understand better why they invented the -f flag to git-rm for
the case you're mentioning above and why the hint doesn't warn about
it's danger, since git-rm is always dangerous. (Ok, as is "rm" without
the "-i"; I just found it normal that cogito behaved like my "-i" setup.)

Regarding the issue of "lost files" because they have been created,
added, and removed again before committing: as far as I remember this
has never happened to me with cogito. I commit often, so if I add a file
or a few, in most cases I commit just this fact (that they have been
added), before doing more fancy stuff. I'm maybe used to thinking in
database terms, work that isn't committed is lost. So if I create a file
and add it, in my brain the "attention, uncommitted work" flag is on,
and it usually doesn't happen that I later erroneously think the work
has been committed when in fact it isn't. (I can always check with a
quick cg-status (which shows the files as "A", which makes them stand
out better than in the git-status output)).

Just before writing this mail I had a case where I wanted to remove a
file from versioning control, but keep it on disk (I used git-rm and
that's how I learned that it really also unlinks the local file without
asking(*)). Note that this has not been an undo action; the file has
been committed previously.

(* thanks to git-reset I could get it back of course)

>
> That is, "git rm" will only ever remove the file without asking, when
> it is safe do so, in that you can retrieve your file from history. 

(Well it's not safe if you want to remove the file *from the index* and
naively mis-use the -f flag as suggested by the hint.)

>  Or
> do you think of another way, in which more safety would be needed ?
>   

I think we have just two different points in our view where we think
safety matters.

Regarding the man pages: yes the git-rm man page is fine, and it's nice
to see the manuals are improving. As noted I came from cogito, and
didn't expect git to behave so different with the same named (but
different purpose) options, so I didn't read the man pages (I've been in
irc and asked there, where someone suggested to bring this to the list;
I'm too tired today to think further about it and will try to read more
docs and hope I'll get to understand the git philosophies more).

Christian.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 21:20       ` git-rm isn't the inverse action of git-add Christian Jaeger
@ 2007-07-03  4:12         ` Jeff King
  2007-07-03  4:47           ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2007-07-03  4:12 UTC (permalink / raw)
  To: Christian Jaeger; +Cc: Yann Dirson, Johannes Schindelin, git

On Mon, Jul 02, 2007 at 11:20:59PM +0200, Christian Jaeger wrote:

> So there's no -f flag in normal git-rm usage. It thus has a different
> meaning, namely "force the operation pair of removing from index and
> unlinking", not "force this operation also onto the checked out files"
> as is the case with cogito.

Yes, git-rm is used in several situations, and the idea is that it
should behave safely in all situations; that is, without -f you can't
delete any data that can't be recovered from your history (but maybe
that means we shouldn't suggest -f in so cavalier a fashion).

Each file has three "states" that we care about: in the HEAD (H), in the
index (I), and in the working tree (W). Let's say you call 'git-rm'.
Here's a table of possibilities (A is some content "A", B is some
content not equal to "A", and N is "non-existant"):

H I W | ok? | why?
---------------------------------------------------
* N * | no  | file is not tracked
N A N | ?   | currently ok, but 'A' recoverable only through fsck
N A A | no  | 'A' recoverable only through fsck
N A B | no  | local modification 'B' would be lost
A A N | yes | 'A' recoverable through history
A A A | yes | 'A' recoverable through history
A A B | no  | local modification 'B' would be lost
A B N | ?   | currently ok, but 'B' recoverable only through fsck
A B A | no  | 'B' recoverable only through fsck
A B B | no  | 'B' recoverable only through fsck
B * * |     | equivalent to H=A

With --cached on, it is a little different:

H I W | ok? | why?
---------------------------------------------------
* N * | no  | file is not tracked
N A N |  ?  | currently ok, but 'A' recoverable only through fsck
N A A |  ?  | currently not ok, but 'A' still available in W
N A B | no  | 'A' recoverable only through fsck
A A N | yes | 'A' recoverable through history
A A A | yes | 'A' recoverable through history or working tree
A A B |  ?  | currently not ok, but 'A' still available in H
A B N |  ?  | currently ok, but 'B' recoverable only through fsck
A B A | no  | 'B' recoverable only through fsck
A B B |  ?  | currently not ok, but 'B' still available in W
B * * |     | equivalent to H=A

So it looks like our safety valve is a bit overbearing in a few
situations, and still misses some situations where data has to be pulled
out of the database with git-fsck.

I think if we actually spell out these possible states in the code, we
can get more accurate behavior, but also more accurate error messages. I
will try to work up a patch.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03  4:12         ` Jeff King
@ 2007-07-03  4:47           ` Junio C Hamano
  2007-07-03  4:59             ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2007-07-03  4:47 UTC (permalink / raw)
  To: Jeff King; +Cc: Christian Jaeger, Yann Dirson, Johannes Schindelin, git

Jeff King <peff@peff.net> writes:

> H I W | ok? | why?
> ---------------------------------------------------
> N A N | ?   | currently ok, but 'A' recoverable only through fsck
> A B N | ?   | currently ok, but 'B' recoverable only through fsck

These were explicitly done per request from git-rm users (myself
not one of them) who wanted to:

	rm the-file
        git rm the-file

sequence not to barf.  I suspect they were from CVS background
who are used to the SCM that complains if you still have the
file in the working tree when you say "scm rm".

I would not mind requiring -f for these cases.

> With --cached on, it is a little different:
>
> H I W | ok? | why?
> ---------------------------------------------------
> N A N |  ?  | currently ok, but 'A' recoverable only through fsck
> N A A |  ?  | currently not ok, but 'A' still available in W
> A A B |  ?  | currently not ok, but 'A' still available in H
> A B N |  ?  | currently ok, but 'B' recoverable only through fsck
> A B B |  ?  | currently not ok, but 'B' still available in W

I personally do not think we would need any safety check for
"git rm --cached", as it does not touch the working tree.  If
one cares about the differences among three states, one would
not issue "rm --cached" anyway.  The only reason "rm --cached"
is used is because one _knows_ that any blob should not exist at
that path in the index.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03  4:47           ` Junio C Hamano
@ 2007-07-03  4:59             ` Jeff King
  2007-07-03  5:09               ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2007-07-03  4:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Jaeger, Yann Dirson, Johannes Schindelin, git

On Mon, Jul 02, 2007 at 09:47:33PM -0700, Junio C Hamano wrote:

> These were explicitly done per request from git-rm users (myself
> not one of them) who wanted to:
> 
> 	rm the-file
>         git rm the-file

Ah, makes sense (if such a thing can be said about CVS behavior).

> > H I W | ok? | why?
> > ---------------------------------------------------
> > N A N |  ?  | currently ok, but 'A' recoverable only through fsck
> > N A A |  ?  | currently not ok, but 'A' still available in W
> > A A B |  ?  | currently not ok, but 'A' still available in H
> > A B N |  ?  | currently ok, but 'B' recoverable only through fsck
> > A B B |  ?  | currently not ok, but 'B' still available in W
> 
> I personally do not think we would need any safety check for
> "git rm --cached", as it does not touch the working tree.  If

It depends on how we want to define "lost" data. In many cases, we are
protecting against losing content that will still be available until the
next git-prune. Should our safety valve protect against that case, or
should it not? We are totally inconsistent.

The main one for --cached, of course, is when that content exists _only_
in the index, but no longer in the working tree (!A A N or !A A B).  You
really should be using regular git-rm (in the first case, since you are
saying "I don't want this file anymore") or git-add (throw out the old
data, use my new version).

OTOH, clearly git-add can "lose" data in this way as well, since a
"modify, git-add, modify, git-add" will "lose" any reference to the
index state after the first add. So maybe that is not worth worrying
about at all (in which case our safety valve is too strict in many
places).

We could also issue a warning when "losing" reference to data that is in
the object db, which would include the sha1; in that case, an immediate
"oops" could be rectified with git-show.

> one cares about the differences among three states, one would
> not issue "rm --cached" anyway.  The only reason "rm --cached"
> is used is because one _knows_ that any blob should not exist at
> that path in the index.

How about:

  git-add foo
  echo changes >>foo
  # oops, I don't want to commit foo just yet
  git-rm --cached foo

but in that case, maybe the user doesn't actually _care_ about that
intermediate state of 'foo'.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03  4:59             ` Jeff King
@ 2007-07-03  5:09               ` Junio C Hamano
  2007-07-03  5:12                 ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2007-07-03  5:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Christian Jaeger, Yann Dirson, Johannes Schindelin, git

Jeff King <peff@peff.net> writes:

> OTOH, clearly git-add can "lose" data in this way as well, since a
> "modify, git-add, modify, git-add" will "lose" any reference to the
> index state after the first add. So maybe that is not worth worrying
> about at all (in which case our safety valve is too strict in many
> places).

Exactly.  And not considering that lossage helps us keep our
sanity.  I think "git rm --cached" falls into the same
category.  If the user wants to discard what is in the index
without losing a copy in the working tree, I think we should let
him do without fuss.

>   git-add foo
>   echo changes >>foo
>   # oops, I don't want to commit foo just yet
>   git-rm --cached foo
>
> but in that case, maybe the user doesn't actually _care_ about that
> intermediate state of 'foo'.

Yes, that is (at least, "used to be") exactly the use case "rm
--cached" is supposed to help.  Added something prematurely to
the index, not ready to commit that part of the changes yet.
Of course you could do partial commits with "add --interactive"
these days, so there is not as much need for this as it used to
be anymore.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03  5:09               ` Junio C Hamano
@ 2007-07-03  5:12                 ` Jeff King
  2007-07-03  6:26                   ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2007-07-03  5:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Christian Jaeger, Yann Dirson, Johannes Schindelin, git

On Mon, Jul 02, 2007 at 10:09:35PM -0700, Junio C Hamano wrote:

> Exactly.  And not considering that lossage helps us keep our
> sanity.  I think "git rm --cached" falls into the same
> category.  If the user wants to discard what is in the index
> without losing a copy in the working tree, I think we should let
> him do without fuss.

OK. So should we _remove_ the safety valve in all cases where we're just
losing stuff that's in the index? It is, after all, recoverable. Should
there be a warning (I suspect it would get annoying very quickly)?

I think this would help by making the use of '-f' more rare, which is
the thing that can _really_ screw you, since it turns off the safety
valve even for things that aren't recoverable.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03  5:12                 ` Jeff King
@ 2007-07-03  6:26                   ` Junio C Hamano
  0 siblings, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2007-07-03  6:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Christian Jaeger, Yann Dirson, Johannes Schindelin, git

Jeff King <peff@peff.net> writes:

> On Mon, Jul 02, 2007 at 10:09:35PM -0700, Junio C Hamano wrote:
>
>> Exactly.  And not considering that lossage helps us keep our
>> sanity.  I think "git rm --cached" falls into the same
>> category.  If the user wants to discard what is in the index
>> without losing a copy in the working tree, I think we should let
>> him do without fuss.
>
> OK. So should we _remove_ the safety valve in all cases where we're just
> losing stuff that's in the index? It is, after all, recoverable. Should
> there be a warning (I suspect it would get annoying very quickly)?

I personally do not think we would need any safety check for
"git rm --cached", as it does not touch the working tree.  For
non-cached case I think the current behaviour is fine.

But I should warn you that I rarely use "git rm" myself.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 21:05         ` Johannes Schindelin
@ 2007-07-03 10:37           ` Matthieu Moy
  2007-07-03 12:09             ` Johannes Schindelin
  0 siblings, 1 reply; 37+ messages in thread
From: Matthieu Moy @ 2007-07-03 10:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Yann Dirson, Christian Jaeger, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> What's so wrong with our man pages? You know, there have been man hours 
> invested in them, and they are exclusively meant for consumption by people 
> who do not know about the usage of the commands...

What's wrong is just that I shouldn't have to read a man page to avoid
data-loss. I should have to read them to do non-trivial things, for
sure.

Useability is not just about documenting surprising behaviors, it's
really about avoiding them.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03 10:37           ` Matthieu Moy
@ 2007-07-03 12:09             ` Johannes Schindelin
  2007-07-03 13:40               ` Matthieu Moy
  0 siblings, 1 reply; 37+ messages in thread
From: Johannes Schindelin @ 2007-07-03 12:09 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Yann Dirson, Christian Jaeger, git

Hi,

On Tue, 3 Jul 2007, Matthieu Moy wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > What's so wrong with our man pages? You know, there have been man 
> > hours invested in them, and they are exclusively meant for consumption 
> > by people who do not know about the usage of the commands...
> 
> What's wrong is just that I shouldn't have to read a man page to avoid
> data-loss.

Okay, Mr Moy. How did you learn that "rm" leads to data-loss? Because it 
does. Hmm. How did you expect then, that git-rm does _not_ lead to data 
loss? If in doubt, you _have_ to read the manual. Especially if the tool 
is powerful.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03 12:09             ` Johannes Schindelin
@ 2007-07-03 13:40               ` Matthieu Moy
  2007-07-03 14:21                 ` Johannes Schindelin
  2007-07-04 20:08                 ` Jan Hudec
  0 siblings, 2 replies; 37+ messages in thread
From: Matthieu Moy @ 2007-07-03 13:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Yann Dirson, Christian Jaeger, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi,
>
> On Tue, 3 Jul 2007, Matthieu Moy wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> 
>> > What's so wrong with our man pages? You know, there have been man 
>> > hours invested in them, and they are exclusively meant for consumption 
>> > by people who do not know about the usage of the commands...
>> 
>> What's wrong is just that I shouldn't have to read a man page to avoid
>> data-loss.
>
> Okay, Mr Moy.

Glad to be called by my name. Is it a tradition here, or a way to make
fun of me?

> How did you learn that "rm" leads to data-loss? Because it does.

It obviously does, and I can't imagine any other behavior than
deleting the file for a command like "rm".

> Hmm. How did you expect then, that git-rm does _not_ lead to data
> loss? 

Because there are tons of possible behaviors for "$VCS rm", and I'd
expect it to be safe even if VCS=git, since it is with all the other
VCS I know.

What's wrong with the behavior of "hg rm"?
What's wrong with the behavior of "svn rm"?
What's wrong with the behavior of "bzr rm"?
(no, I won't do it with CVS ;-) )

None of these commands have the problem that git-rm has.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03 13:40               ` Matthieu Moy
@ 2007-07-03 14:21                 ` Johannes Schindelin
  2007-07-04 20:08                 ` Jan Hudec
  1 sibling, 0 replies; 37+ messages in thread
From: Johannes Schindelin @ 2007-07-03 14:21 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Yann Dirson, Christian Jaeger, git

Hi,

On Tue, 3 Jul 2007, Matthieu Moy wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Tue, 3 Jul 2007, Matthieu Moy wrote:
> >
> >> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >> 
> >> > What's so wrong with our man pages? You know, there have been man 
> >> > hours invested in them, and they are exclusively meant for 
> >> > consumption by people who do not know about the usage of the 
> >> > commands...
> >> 
> >> What's wrong is just that I shouldn't have to read a man page to 
> >> avoid data-loss.
> >
> > Okay, Mr Moy.
> 
> Glad to be called by my name. Is it a tradition here, or a way to make 
> fun of me?

I tried to be funny, by introducing some diversity...

> > How did you learn that "rm" leads to data-loss? Because it does.
> 
> It obviously does, and I can't imagine any other behavior than deleting 
> the file for a command like "rm".
> 
> > Hmm. How did you expect then, that git-rm does _not_ lead to data
> > loss? 
> 
> Because there are tons of possible behaviors for "$VCS rm", and I'd 
> expect it to be safe even if VCS=git, since it is with all the other VCS 
> I know.

Which proves exactly my point. There are a ton of interpretations that 
make sense. So I would always look into the man page.

> What's wrong with the behavior of "hg rm"?
> What's wrong with the behavior of "svn rm"?
> What's wrong with the behavior of "bzr rm"?
> (no, I won't do it with CVS ;-) )
> 
> None of these commands have the problem that git-rm has.

Guess what. I do not know how they operate! I have no idea what the 
behaviour of the commands you mentioned is. So before I would answer (if 
they were not rethoric questions), I would actually really read the man 
page to know what they are supposed to do.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-03 13:40               ` Matthieu Moy
  2007-07-03 14:21                 ` Johannes Schindelin
@ 2007-07-04 20:08                 ` Jan Hudec
  2007-07-05 13:44                   ` Matthieu Moy
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Hudec @ 2007-07-04 20:08 UTC (permalink / raw)
  To: Johannes Schindelin, Yann Dirson, Christian Jaeger, git

[-- Attachment #1: Type: text/plain, Size: 1228 bytes --]

On Tue, Jul 03, 2007 at 15:40:12 +0200, Matthieu Moy wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> > Hmm. How did you expect then, that git-rm does _not_ lead to data
> > loss? 
> 
> Because there are tons of possible behaviors for "$VCS rm", and I'd
> expect it to be safe even if VCS=git, since it is with all the other
> VCS I know.
> 
> What's wrong with the behavior of "hg rm"?
> What's wrong with the behavior of "svn rm"?
> What's wrong with the behavior of "bzr rm"?
> (no, I won't do it with CVS ;-) )
> 
> None of these commands have the problem that git-rm has.

Hm. They all behave roughly the same: They unversion the file and unlink it,
unless it is modified, in which case they unversion it and leave it alone.

Now git has the extra complexity that index contains also content of the
file. But the behaviour can be easily adapted like this (HEAD = version in
HEAD, index = version in index, tree = version in tree):
 - if (HEAD == index && index == version) unversion and unlink
 - else if (HEAD == index || index == version) unversion
 - else print message and do nothing

Would you consider that a sane behaviour?

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-04 20:08                 ` Jan Hudec
@ 2007-07-05 13:44                   ` Matthieu Moy
  2007-07-05 14:00                     ` David Kastrup
  2007-07-08 17:36                     ` [RFC][PATCH] " Matthieu Moy
  0 siblings, 2 replies; 37+ messages in thread
From: Matthieu Moy @ 2007-07-05 13:44 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Johannes Schindelin, Yann Dirson, Christian Jaeger, git

Jan Hudec <bulb@ucw.cz> writes:

>> What's wrong with the behavior of "hg rm"?
>> What's wrong with the behavior of "svn rm"?
>> What's wrong with the behavior of "bzr rm"?
>> (no, I won't do it with CVS ;-) )
>> 
>> None of these commands have the problem that git-rm has.
>
> Hm. They all behave roughly the same: They unversion the file and unlink it,
> unless it is modified, in which case they unversion it and leave it
> alone.

Yes. Roughly, they'll ask you a --force flag whenever you'd risk
data-loss. bzr gives you the choice between --force and --keep (that
would be --cached in git) if the file doesn't match HEAD.

> Now git has the extra complexity that index contains also content of the
> file. But the behaviour can be easily adapted like this (HEAD = version in
> HEAD, index = version in index, tree = version in tree):
                                  ^^^^- I suppose you meant "version"
                                        here since you don't use
                                        "tree" after.

>  - if (HEAD == index && index == version) unversion and unlink

Just to be more precise:

   - if (HEAD == index && index == version) unversion and
       * if (--cached is not given) unlink
       * else do nothing

>  - else if (HEAD == index || index == version) unversion
>  - else print message and do nothing
>
> Would you consider that a sane behaviour?

To me, that's a sane behavior.

It makes a few senarios easy and safe, like this:

  $ git add <whatever>
  # Ooops, no, I didn't want to version this one :-(
  $ git rm some-file
  # Cool, I just cancelled my mistake without loosing anything ;-)
  
One benefit is: you don't have to use "-f" for a non-dangerous
senario. That seems stupid, but for the plain "rm" command, the "-rf"
is hardcoded in the fingers of many unix users, and I know several
people having lost data by typing it a bit too mechanically (with a
typo behind, like forgetting the "*" in "*~" ;-).

I'll try writting patch for that if people agree that this is saner
that the current behavior.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-05 13:44                   ` Matthieu Moy
@ 2007-07-05 14:00                     ` David Kastrup
  2007-07-08 17:36                     ` [RFC][PATCH] " Matthieu Moy
  1 sibling, 0 replies; 37+ messages in thread
From: David Kastrup @ 2007-07-05 14:00 UTC (permalink / raw)
  To: git

Matthieu Moy <Matthieu.Moy@imag.fr> writes:

> One benefit is: you don't have to use "-f" for a non-dangerous
> senario. That seems stupid, but for the plain "rm" command, the
> "-rf" is hardcoded in the fingers of many unix users, and I know
> several people having lost data by typing it a bit too mechanically
> (with a typo behind, like forgetting the "*" in "*~" ;-).

Just a few days ago, I used rm -rf * in a temporary directory.  I
would now advise people against doing that without an absolute path.
The problem was that at some later point of time, some history
search/key fsckup popped that line back into the shell and executed
it.

At that time, in my home directory.  This was definitely annoying,
even though the files and directories .* (and thus most configuration
data) were spared.

> I'll try writting patch for that if people agree that this is saner
> that the current behavior.

Sounds like it.

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [RFC][PATCH] Re: git-rm isn't the inverse action of git-add
  2007-07-05 13:44                   ` Matthieu Moy
  2007-07-05 14:00                     ` David Kastrup
@ 2007-07-08 17:36                     ` Matthieu Moy
  2007-07-08 18:10                       ` Johannes Schindelin
  1 sibling, 1 reply; 37+ messages in thread
From: Matthieu Moy @ 2007-07-08 17:36 UTC (permalink / raw)
  To: git; +Cc: Jan Hudec, Johannes Schindelin, Yann Dirson, Christian Jaeger

Matthieu Moy <Matthieu.Moy@imag.fr> writes:

>>  - if (HEAD == index && index == version) unversion and unlink
>
> Just to be more precise:
>
>    - if (HEAD == index && index == version) unversion and
>        * if (--cached is not given) unlink
>        * else do nothing
>
>>  - else if (HEAD == index || index == version) unversion
>>  - else print message and do nothing
>>
>> Would you consider that a sane behaviour?

[...]

> I'll try writting patch for that if people agree that this is saner
> that the current behavior.

Here's a first attempt (I'm still not familiar with the git codebase,
so the patch is probably not so good).

Note: currently, git-rm still shows those "rm '...'" messages on
stdout. AAUI, they were actually useful at a time when git-rm didn't
actually remove the files, and people actually ran the "rm" commands
after. They can probably be removed now, but that's another topic.


>From f4f4aa047b2b9050d968704d1f2db07b2a1a79cc Mon Sep 17 00:00:00 2001
From: Matthieu Moy <Matthieu.Moy@imag.fr>
Date: Sun, 8 Jul 2007 19:27:44 +0200
Subject: [PATCH] Make git-rm obey in more circumstances.

In the previous behavior of git-rm, git refused to do anything in case of
a difference between the file on disk, the index, and the HEAD. As a
result, the -f flag is forced even for simple senarios like:

$ git add foo
# oops, I didn't want to version it
$ git rm -f [--cached] foo
# foo is deleted on disk if --cached isn't provided.

This patch proposes a saner behavior. When there are no difference at all
between file, index and HEAD, the file is removed both from the index and
the tree, as before.

Otherwise, if the index matches either the file on disk or the HEAD, the
file is removed from the index, but the file is kept on disk, it may
contain important data.

Otherwise, that's an error, and git-rm aborts.

The above senario becomes

$ git add foo
$ git rm foo
# back to the initial state.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
---
 Documentation/git-rm.txt |    9 ++++---
 builtin-rm.c             |   55 ++++++++++++++++++++++++++++++++++++---------
 2 files changed, 49 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index 78f45dc..180671c 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -11,10 +11,11 @@ SYNOPSIS
 
 DESCRIPTION
 -----------
-Remove files from the working tree and from the index.  The
-files have to be identical to the tip of the branch, and no
-updates to its contents must have been placed in the staging
-area (aka index).
+Remove files from the working tree and from the index. The content
+placed in the staging area (aka index) must match either the content
+of the file on disk, or the tip of the branch. If it matches only one
+of them, the file is kept on disk for safety, but is still removed
+from the index.
 
 
 OPTIONS
diff --git a/builtin-rm.c b/builtin-rm.c
index 4a0bd93..d2a8998 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -12,18 +12,27 @@
 static const char builtin_rm_usage[] =
 "git-rm [-f] [-n] [-r] [--cached] [--ignore-unmatch] [--quiet] [--] <file>...";
 
+struct file_info {
+	const char *name;
+	int local_changes;
+	int staged_changes;
+};
+
 static struct {
 	int nr, alloc;
-	const char **name;
+	struct file_info * files;
 } list;
 
 static void add_list(const char *name)
 {
 	if (list.nr >= list.alloc) {
 		list.alloc = alloc_nr(list.alloc);
-		list.name = xrealloc(list.name, list.alloc * sizeof(const char *));
+		list.files = xrealloc(list.files, list.alloc * sizeof(const char *));
 	}
-	list.name[list.nr++] = name;
+	list.files[list.nr].name = name;
+	list.files[list.nr].local_changes  = 0;
+	list.files[list.nr].staged_changes = 0;
+	list.nr++;
 }
 
 static int remove_file(const char *name)
@@ -46,6 +55,26 @@ static int remove_file(const char *name)
 	return ret;
 }
 
+static int remove_file_maybe(const struct file_info fi, int quiet)
+{
+	const char *path = fi.name;
+	if (!fi.local_changes && !fi.staged_changes) {
+		/* The file matches either the index or the HEAD.
+		 * It's content exists somewhere else, it's safe to
+		 * delete it.
+		 */
+		return remove_file(path);
+	} else {
+		if (!quiet)
+			fprintf(stderr, 
+				"note: file '%s' not removed "
+				"(doesn't match %s).\n",
+				path,
+				fi.local_changes?"the index":"HEAD");
+		return 0;
+	}
+}
+
 static int check_local_mod(unsigned char *head)
 {
 	/* items in list are already sorted in the cache order,
@@ -62,7 +91,7 @@ static int check_local_mod(unsigned char *head)
 		struct stat st;
 		int pos;
 		struct cache_entry *ce;
-		const char *name = list.name[i];
+		const char *name = list.files[i].name;
 		unsigned char sha1[20];
 		unsigned mode;
 
@@ -87,13 +116,17 @@ static int check_local_mod(unsigned char *head)
 			continue;
 		}
 		if (ce_match_stat(ce, &st, 0))
-			errs = error("'%s' has local modifications "
-				     "(hint: try -f)", ce->name);
+			list.files[i].local_changes = 1;
+
 		if (no_head
 		     || get_tree_entry(head, name, sha1, &mode)
 		     || ce->ce_mode != create_ce_mode(mode)
 		     || hashcmp(ce->sha1, sha1))
-			errs = error("'%s' has changes staged in the index "
+			list.files[i].staged_changes = 1;
+
+		if (list.files[i].local_changes && 
+		    list.files[i].staged_changes)
+			errs = error("'%s' doesn't match neither HEAD nor the index "
 				     "(hint: try -f)", name);
 	}
 	return errs;
@@ -201,7 +234,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	 * the index unless all of them succeed.
 	 */
 	for (i = 0; i < list.nr; i++) {
-		const char *path = list.name[i];
+		const char *path = list.files[i].name;
 		if (!quiet)
 			printf("rm '%s'\n", path);
 
@@ -224,13 +257,13 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (!index_only) {
 		int removed = 0;
 		for (i = 0; i < list.nr; i++) {
-			const char *path = list.name[i];
-			if (!remove_file(path)) {
+			if (!remove_file_maybe(list.files[i], quiet)) {
 				removed = 1;
 				continue;
 			}
 			if (!removed)
-				die("git-rm: %s: %s", path, strerror(errno));
+				die("git-rm: %s: %s", 
+				    list.files[i].name, strerror(errno));
 		}
 	}
 
-- 
1.5.3.rc0.63.gc956-dirty



-- 
Matthieu

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC][PATCH] Re: git-rm isn't the inverse action of git-add
  2007-07-08 17:36                     ` [RFC][PATCH] " Matthieu Moy
@ 2007-07-08 18:10                       ` Johannes Schindelin
  2007-07-08 20:34                         ` Matthieu Moy
  0 siblings, 1 reply; 37+ messages in thread
From: Johannes Schindelin @ 2007-07-08 18:10 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git, Jan Hudec, Yann Dirson, Christian Jaeger

Hi,

On Sun, 8 Jul 2007, Matthieu Moy wrote:

> Subject: [PATCH] Make git-rm obey in more circumstances.

This is not really a good patch title.  Since it only obeys your 
particular understanding of what it should do.  You are changing 
semantics, and you should say so.

> In the previous behavior of git-rm, git refused to do anything in case 
> of a difference between the file on disk, the index, and the HEAD. As a 
> result, the -f flag is forced even for simple senarios like:
> 
> $ git add foo
> # oops, I didn't want to version it
> $ git rm -f [--cached] foo
> # foo is deleted on disk if --cached isn't provided.
> 
> This patch proposes a saner behavior. When there are no difference at 
> all between file, index and HEAD, the file is removed both from the 
> index and the tree, as before.
> 
> Otherwise, if the index matches either the file on disk or the HEAD, the 
> file is removed from the index, but the file is kept on disk, it may 
> contain important data.

However, if some of the files are of the first kind, and some are of the 
second kind, you happily apply with mixed strategies.  IMO that is wrong.

>  static struct {
>  	int nr, alloc;
> -	const char **name;
> +	struct file_info * files;
>  } list;
>  
>  static void add_list(const char *name)
>  {
>  	if (list.nr >= list.alloc) {
>  		list.alloc = alloc_nr(list.alloc);
> -		list.name = xrealloc(list.name, list.alloc * sizeof(const char *));
> +		list.files = xrealloc(list.files, list.alloc * sizeof(const char *));

This is wrong, too.  Yes, it works.  But it really should be 
"sizeof(struct file_info *)".  Remember, code is also documentation.

> +static int remove_file_maybe(const struct file_info fi, int quiet)
> +{
> +	const char *path = fi.name;
> +	if (!fi.local_changes && !fi.staged_changes) {
> +		/* The file matches either the index or the HEAD.
> +		 * It's content exists somewhere else, it's safe to
> +		 * delete it.
> +		 */
> +		return remove_file(path);
> +	} else {

Superfluous "{ .. }".

> +		if (!quiet)
> +			fprintf(stderr, 
> +				"note: file '%s' not removed "
> +				"(doesn't match %s).\n",
> +				path,
> +				fi.local_changes?"the index":"HEAD");
> +		return 0;
> +	}
> +}

I suspect that this case does never fail. 0 means success for 
remove_file().  Not good.  You should at least have a way to ensure that 
it removed the files from the working tree from a script.  Otherwise there 
is not much point in returning a value to begin with.

> @@ -224,13 +257,13 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>  	if (!index_only) {
>  		int removed = 0;
>  		for (i = 0; i < list.nr; i++) {
> -			const char *path = list.name[i];
> -			if (!remove_file(path)) {
> +			if (!remove_file_maybe(list.files[i], quiet)) {
>  				removed = 1;
>  				continue;
>  			}
>  			if (!removed)
> -				die("git-rm: %s: %s", path, strerror(errno));
> +				die("git-rm: %s: %s", 
> +				    list.files[i].name, strerror(errno));
>  		}
>  	}

Style: the old code set and used "path" for readability.  You should do 
the same (with "file", probably).

Additionally, since this changes semantics, you better provide test cases 
to show what is expected to work, and _ensure_ that it actually works.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC][PATCH] Re: git-rm isn't the inverse action of git-add
  2007-07-08 18:10                       ` Johannes Schindelin
@ 2007-07-08 20:34                         ` Matthieu Moy
  2007-07-08 21:49                           ` Johannes Schindelin
  0 siblings, 1 reply; 37+ messages in thread
From: Matthieu Moy @ 2007-07-08 20:34 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jan Hudec, Yann Dirson, Christian Jaeger

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> This patch proposes a saner behavior. When there are no difference at 
>> all between file, index and HEAD, the file is removed both from the 
>> index and the tree, as before.
>> 
>> Otherwise, if the index matches either the file on disk or the HEAD, the 
>> file is removed from the index, but the file is kept on disk, it may 
>> contain important data.
>
> However, if some of the files are of the first kind, and some are of the 
> second kind, you happily apply with mixed strategies.  IMO that is wrong.

I'm not sure whether this is really wrong. The things git should
really care about are the index and the repository itself, and the
proposed behavior is consistant regarding that (either remove all
files from the index, or remove none).

I'm not opposed to your proposal, but I'd like to have other
opinion(s) on that before changing the code.

>>  static struct {
>>  	int nr, alloc;
>> -	const char **name;
>> +	struct file_info * files;
>>  } list;
>>  
>>  static void add_list(const char *name)
>>  {
>>  	if (list.nr >= list.alloc) {
>>  		list.alloc = alloc_nr(list.alloc);
>> -		list.name = xrealloc(list.name, list.alloc * sizeof(const char *));
>> +		list.files = xrealloc(list.files, list.alloc * sizeof(const char *));
>
> This is wrong, too.  Yes, it works.  But it really should be 
> "sizeof(struct file_info *)".  Remember, code is also documentation.

You don't need to argue, that was a typo. My code is definitely wrong,
but you're wrong too ;-). That's actually sizeof(struct file_info).

>> +		if (!quiet)
>> +			fprintf(stderr, 
>> +				"note: file '%s' not removed "
>> +				"(doesn't match %s).\n",
>> +				path,
>> +				fi.local_changes?"the index":"HEAD");
>> +		return 0;
>> +	}
>> +}
>
> I suspect that this case does never fail. 0 means success for 
> remove_file().  Not good.  You should at least have a way to ensure that 
> it removed the files from the working tree from a script.  Otherwise there 
> is not much point in returning a value to begin with.

I've changed it to have exit_status = 1 if git-rm aborted before
starting, and 2 if git-rm skiped some file removals (and of course, 0
if everything is done as expected).

>> @@ -224,13 +257,13 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>>  	if (!index_only) {
>>  		int removed = 0;
>>  		for (i = 0; i < list.nr; i++) {
>> -			const char *path = list.name[i];
>> -			if (!remove_file(path)) {
>> +			if (!remove_file_maybe(list.files[i], quiet)) {
>>  				removed = 1;
>>  				continue;
>>  			}
>>  			if (!removed)
>> -				die("git-rm: %s: %s", path, strerror(errno));
>> +				die("git-rm: %s: %s", 
>> +				    list.files[i].name, strerror(errno));
>>  		}
>>  	}
>
> Style: the old code set and used "path" for readability.  You should do 
> the same (with "file", probably).

Done.

> Additionally, since this changes semantics, you better provide test cases 
> to show what is expected to work, and _ensure_ that it actually works.

Sure. I forgot to mention it in my message, but I wanted to have
feedback before getting into the testsuite stuff.

I'm posting the updated patch for info, but it should anyway not be
merged until

* We agree on the behavior when different files have different kinds
  of changes

* I add a testcase.



>From f39ae646049b95b055e34da378ea470ef3f3caef Mon Sep 17 00:00:00 2001
From: Matthieu Moy <Matthieu.Moy@imag.fr>
Date: Sun, 8 Jul 2007 19:27:44 +0200
Subject: [PATCH] Change the behavior of git-rm to let it obey in more circumstances without -f.

In the previous behavior of git-rm, git refused to do anything in case of
a difference between the file on disk, the index, and the HEAD. As a
result, the -f flag is forced even for simple senarios like:

$ git add foo
$ git rm -f [--cached] foo

This patch proposes a saner behavior. When there are no difference at all
between file, index and HEAD, the file is removed both from the index and
the tree, as before.

Otherwise, if the index matches either the file on disk or the HEAD, the
file is removed from the index, but the file is kept on disk, it may
contain important data.

Otherwise, that's an error, and git-rm aborts.

The above senario becomes

$ git add foo
$ git rm foo

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
---
 Documentation/git-rm.txt |    9 +++--
 builtin-rm.c             |   71 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 64 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index 78f45dc..180671c 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -11,10 +11,11 @@ SYNOPSIS
 
 DESCRIPTION
 -----------
-Remove files from the working tree and from the index.  The
-files have to be identical to the tip of the branch, and no
-updates to its contents must have been placed in the staging
-area (aka index).
+Remove files from the working tree and from the index. The content
+placed in the staging area (aka index) must match either the content
+of the file on disk, or the tip of the branch. If it matches only one
+of them, the file is kept on disk for safety, but is still removed
+from the index.
 
 
 OPTIONS
diff --git a/builtin-rm.c b/builtin-rm.c
index 4a0bd93..08af5de 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -12,20 +12,30 @@
 static const char builtin_rm_usage[] =
 "git-rm [-f] [-n] [-r] [--cached] [--ignore-unmatch] [--quiet] [--] <file>...";
 
+struct file_info {
+	const char *name;
+	int local_changes;
+	int staged_changes;
+};
+
 static struct {
 	int nr, alloc;
-	const char **name;
+	struct file_info *files;
 } list;
 
 static void add_list(const char *name)
 {
 	if (list.nr >= list.alloc) {
 		list.alloc = alloc_nr(list.alloc);
-		list.name = xrealloc(list.name, list.alloc * sizeof(const char *));
+		list.files = xrealloc(list.files, list.alloc * sizeof(struct file_info));
 	}
-	list.name[list.nr++] = name;
+	list.files[list.nr].name = name;
+	list.files[list.nr].local_changes  = 0;
+	list.files[list.nr].staged_changes = 0;
+	list.nr++;
 }
 
+/* Returns -1 on error, zero on success */
 static int remove_file(const char *name)
 {
 	int ret;
@@ -46,6 +56,30 @@ static int remove_file(const char *name)
 	return ret;
 }
 
+/* Returns 0 if the file was actually deleted, -1 if the file removal
+   was a failure, and 1 if remove_file wasn't actually called */
+static int remove_file_maybe(const struct file_info fi, int quiet)
+{
+	const char *path = fi.name;
+	if (!fi.local_changes && !fi.staged_changes)
+		/* The file matches either the index or the HEAD.
+		 * It's content exists somewhere else, it's safe to
+		 * delete it.
+		 */
+		return remove_file(path);
+	else {
+		if (!quiet)
+			fprintf(stderr, 
+				"note: file '%s' not removed "
+				"(%s).\n",
+				path,
+				fi.local_changes ? 
+				"the index doesn't match HEAD" :
+				"the file doesn't match the index");
+		return 1;
+	}
+}
+
 static int check_local_mod(unsigned char *head)
 {
 	/* items in list are already sorted in the cache order,
@@ -62,7 +96,7 @@ static int check_local_mod(unsigned char *head)
 		struct stat st;
 		int pos;
 		struct cache_entry *ce;
-		const char *name = list.name[i];
+		const char *name = list.files[i].name;
 		unsigned char sha1[20];
 		unsigned mode;
 
@@ -87,13 +121,17 @@ static int check_local_mod(unsigned char *head)
 			continue;
 		}
 		if (ce_match_stat(ce, &st, 0))
-			errs = error("'%s' has local modifications "
-				     "(hint: try -f)", ce->name);
+			list.files[i].local_changes = 1;
+
 		if (no_head
 		     || get_tree_entry(head, name, sha1, &mode)
 		     || ce->ce_mode != create_ce_mode(mode)
 		     || hashcmp(ce->sha1, sha1))
-			errs = error("'%s' has changes staged in the index "
+			list.files[i].staged_changes = 1;
+
+		if (list.files[i].local_changes && 
+		    list.files[i].staged_changes)
+			errs = error("'%s' doesn't match neither HEAD nor the index "
 				     "(hint: try -f)", name);
 	}
 	return errs;
@@ -108,6 +146,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	int ignore_unmatch = 0;
 	const char **pathspec;
 	char *seen;
+	int exit_status = 0;
 
 	git_config(git_default_config);
 
@@ -201,7 +240,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	 * the index unless all of them succeed.
 	 */
 	for (i = 0; i < list.nr; i++) {
-		const char *path = list.name[i];
+		const char *path = list.files[i].name;
 		if (!quiet)
 			printf("rm '%s'\n", path);
 
@@ -224,13 +263,21 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (!index_only) {
 		int removed = 0;
 		for (i = 0; i < list.nr; i++) {
-			const char *path = list.name[i];
-			if (!remove_file(path)) {
+			struct file_info file = list.files[i];
+			int status = remove_file_maybe(file, quiet);
+			if (status == 0) {
 				removed = 1;
 				continue;
+			} else if (status == 1) {
+				/* Let the user know from a script
+				 * that a file was not deleted on disk
+				 */
+				exit_status = 2;
+				continue;
 			}
 			if (!removed)
-				die("git-rm: %s: %s", path, strerror(errno));
+				die("git-rm: %s: %s", 
+				    file.name, strerror(errno));
 		}
 	}
 
@@ -240,5 +287,5 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 			die("Unable to write new index file");
 	}
 
-	return 0;
+	return exit_status;
 }
-- 
1.5.3.rc0.64.gf4f4a-dirty



-- 
Matthieu

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC][PATCH] Re: git-rm isn't the inverse action of git-add
  2007-07-08 20:34                         ` Matthieu Moy
@ 2007-07-08 21:49                           ` Johannes Schindelin
  2007-07-09  9:45                             ` Matthieu Moy
  2007-07-13 17:36                             ` Matthieu Moy
  0 siblings, 2 replies; 37+ messages in thread
From: Johannes Schindelin @ 2007-07-08 21:49 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git, Jan Hudec, Yann Dirson, Christian Jaeger

Hi,

On Sun, 8 Jul 2007, Matthieu Moy wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> This patch proposes a saner behavior. When there are no difference at 
> >> all between file, index and HEAD, the file is removed both from the 
> >> index and the tree, as before.
> >> 
> >> Otherwise, if the index matches either the file on disk or the HEAD, 
> >> the file is removed from the index, but the file is kept on disk, it 
> >> may contain important data.
> >
> > However, if some of the files are of the first kind, and some are of 
> > the second kind, you happily apply with mixed strategies.  IMO that is 
> > wrong.
> 
> I'm not sure whether this is really wrong. The things git should
> really care about are the index and the repository itself, and the
> proposed behavior is consistant regarding that (either remove all
> files from the index, or remove none).

Well, I think it is wrong for the same reason as it is wrong to apply the 
changes to _any_ file when one would fail.  And since "git apply" shares 
my understanding, I think "git rm" should, too.

> >>  static struct {
> >>  	int nr, alloc;
> >> -	const char **name;
> >> +	struct file_info * files;
> >>  } list;
> >>  
> >>  static void add_list(const char *name)
> >>  {
> >>  	if (list.nr >= list.alloc) {
> >>  		list.alloc = alloc_nr(list.alloc);
> >> -		list.name = xrealloc(list.name, list.alloc * sizeof(const char *));
> >> +		list.files = xrealloc(list.files, list.alloc * sizeof(const char *));
> >
> > This is wrong, too.  Yes, it works.  But it really should be 
> > "sizeof(struct file_info *)".  Remember, code is also documentation.
> 
> You don't need to argue, that was a typo. My code is definitely wrong, 
> but you're wrong too ;-). That's actually sizeof(struct file_info).

Heh, right.

> >> +		if (!quiet)
> >> +			fprintf(stderr, 
> >> +				"note: file '%s' not removed "
> >> +				"(doesn't match %s).\n",
> >> +				path,
> >> +				fi.local_changes?"the index":"HEAD");
> >> +		return 0;
> >> +	}
> >> +}
> >
> > I suspect that this case does never fail. 0 means success for 
> > remove_file().  Not good.  You should at least have a way to ensure that 
> > it removed the files from the working tree from a script.  Otherwise there 
> > is not much point in returning a value to begin with.
> 
> I've changed it to have exit_status = 1 if git-rm aborted before
> starting, and 2 if git-rm skiped some file removals (and of course, 0
> if everything is done as expected).

Oh, so you do not take the return value of this function to determine if 
it has or has not done something with the files?  That's a bit confusing.

Besides, it would be all the more a reason for a test case, so that I can 
see that I am actually wrong.

> > Additionally, since this changes semantics, you better provide test 
> > cases to show what is expected to work, and _ensure_ that it actually 
> > works.
> 
> Sure. I forgot to mention it in my message, but I wanted to have 
> feedback before getting into the testsuite stuff.

I think it should be the other way.  If you change semantics with the 
patch, but another revision changes semantics _differently_, it is really 
easy to get lost.  In order to demonstrate what should be true, you have 
to provide examples.  And if you are already providing examples, just wrap 
them into

	test_description <description>
	. ./test-lib.sh

	...

	test_done

and prefix each test with "test_expect_success", and you're done.  It is 
really not something requiring a wizard.

> I'm posting the updated patch for info, but it should anyway not be
> merged until
> 
> * We agree on the behavior when different files have different kinds
>   of changes

I'd understand better what you wish to accomplish with the...

> * I add a testcase.

... testcase. So those are not two distinct points.

> >From f39ae646049b95b055e34da378ea470ef3f3caef Mon Sep 17 00:00:00 2001
> From: Matthieu Moy <Matthieu.Moy@imag.fr>
> Date: Sun, 8 Jul 2007 19:27:44 +0200
> Subject: [PATCH] Change the behavior of git-rm to let it obey in more circumstances without -f.

Please do not do this.

I meant to complain about your OP, but this time it is even worse.  The 
best way to guarantee that a patch gets lost in a thread is to move it _at 
the end_ of a reply.

Please follow the form that you change the subject, still reply, but but 
the quoted mail with your answers to that text between the "---" and the 
diffstat.

If that text is too long, you should use a separate email for the patch.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC][PATCH] Re: git-rm isn't the inverse action of git-add
  2007-07-08 21:49                           ` Johannes Schindelin
@ 2007-07-09  9:45                             ` Matthieu Moy
  2007-07-13 17:36                             ` Matthieu Moy
  1 sibling, 0 replies; 37+ messages in thread
From: Matthieu Moy @ 2007-07-09  9:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jan Hudec, Yann Dirson, Christian Jaeger

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi,
>
> On Sun, 8 Jul 2007, Matthieu Moy wrote:
>
>> I'm not sure whether this is really wrong. The things git should
>> really care about are the index and the repository itself, and the
>> proposed behavior is consistant regarding that (either remove all
>> files from the index, or remove none).
>
> Well, I think it is wrong for the same reason as it is wrong to apply the 
> changes to _any_ file when one would fail.  And since "git apply" shares 
> my understanding, I think "git rm" should, too.

OK, let's say I'm convinced ;-).

>> > I suspect that this case does never fail. 0 means success for 
>> > remove_file().  Not good.  You should at least have a way to ensure that 
>> > it removed the files from the working tree from a script.  Otherwise there 
>> > is not much point in returning a value to begin with.
>> 
>> I've changed it to have exit_status = 1 if git-rm aborted before
>> starting, and 2 if git-rm skiped some file removals (and of course, 0
>> if everything is done as expected).
>
> Oh, so you do not take the return value of this function to determine if 
> it has or has not done something with the files?  That's a bit confusing.

I did, but previously, I kept the code that "die()"s if the first call
to remove_file() "fails". In remove_file_maybe(), not removing a file
because it's not sure it's safe to delete it is not a failure, so I
had to put a "return 0;" here to avoid the fatal error. My first patch
had return status !=0 if we tried to remove the file, and it failed. I
changed that.

> I meant to complain about your OP, but this time it is even worse.  The 
> best way to guarantee that a patch gets lost in a thread is to move it _at 
> the end_ of a reply.

I had posted the patch for info, but I did expect this one to get
lost, since it's definitely not complete.

I'll post an updated patch with a testcase and an appropriate subject
line within a few days (I don't have time right now).

Thanks,

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-02 20:23   ` Christian Jaeger
  2007-07-02 20:40     ` Yann Dirson
@ 2007-07-11 12:20     ` Jakub Narebski
  2007-07-11 18:56       ` Jan Hudec
  1 sibling, 1 reply; 37+ messages in thread
From: Jakub Narebski @ 2007-07-11 12:20 UTC (permalink / raw)
  To: git

Christian Jaeger wrote:

> I don't per se require undo actions. I just don't understand why git-rm
> refuses to remove the file from the index, even if I didn't commit it.
> The index is just an intermediate record of the changes in my
> understandings, and the rm action would also be intermediate until it's
> being committed. And a non-committed action being deleted shouldn't need
> a special confirmation from me, especially not one which is consisting
> of a combination of two flags (of which one is a destructive one).

Should git-rm refuse to remove index entry if it is different from working
directory version or not?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-11 12:20     ` Jakub Narebski
@ 2007-07-11 18:56       ` Jan Hudec
  2007-07-11 21:26         ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Hudec @ 2007-07-11 18:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1245 bytes --]

On Wed, Jul 11, 2007 at 14:20:24 +0200, Jakub Narebski wrote:
> Christian Jaeger wrote:
> > I don't per se require undo actions. I just don't understand why git-rm
> > refuses to remove the file from the index, even if I didn't commit it.
> > The index is just an intermediate record of the changes in my
> > understandings, and the rm action would also be intermediate until it's
> > being committed. And a non-committed action being deleted shouldn't need
> > a special confirmation from me, especially not one which is consisting
> > of a combination of two flags (of which one is a destructive one).
> 
> Should git-rm refuse to remove index entry if it is different from working
> directory version or not?

IMHO it should refuse to remove index entry if it is different from both
working-tree version and versions in all parents.

If index matches any of that, but the working tree version does not match any
parent, the index entry should be removed (which currently isn't -- that's
the proposed change), but the file left in wokring tree. That would make
git-add + git-rm get you right back where you started, with nothing in index
and unversioned file in working tree.

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: git-rm isn't the inverse action of git-add
  2007-07-11 18:56       ` Jan Hudec
@ 2007-07-11 21:26         ` Junio C Hamano
  0 siblings, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2007-07-11 21:26 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Jakub Narebski, git

Jan Hudec <bulb@ucw.cz> writes:

> If index matches any of that, but the working tree version does not match any
> parent, the index entry should be removed (which currently isn't -- that's
> the proposed change), but the file left in wokring tree. That would make
> git-add + git-rm get you right back where you started, with nothing in index
> and unversioned file in working tree.

Don't think of 'rm' as inverse of 'add'.  That would only
confuse you.

A natural inverse of 'add' is 'un-add', and that operation is
called 'rm --cached', because we use that to name the option to
invoke an "index-only" variant of a command when the command can
operate on index and working tree file (e.g. "diff --cached",
"apply --cached").

A life of a file that does _not_ make into a commit goes like
this:

	[1]$ edit a-new-file

This is 'create', not 'add'.  git is not involved in this step.

	[2]$ git add a-new-file

This is 'add'; place an existing file in the index.  When you do
not want it in the index, you 'un-add' it.

	[3]$ git rm --cached a-new-file

This removes the entry from the index, without touching the
working tree file.  If you do not want that file at all (as
opposed to, "I am making a series of partial commits, and the
addition of this path does not belong to the first commit of the
series, so I am unstaging"), this is followed by

	[4]$ rm -f a-new-file

Again, git is not involved in this step.

The thing is, people sometimes want to have steps 3 and 4
combined, and it meshes well with the users' expectation when
they see the word "rm".  Think of "git rm" without "--cached" as
a shorthand to do 3 and 4 in one go to meet that expectation.

Obviously, we cannot usefully combine steps 1 and 2.  We could
have "git add --create a-new-file" launch an editor to create a
new file, but that would not be very useful in practice.

The fact that steps 3 and 4 can be naturally combined, but steps
1 and 2 cannot be, makes "add" and "rm" not inverse of each
other.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC][PATCH] Re: git-rm isn't the inverse action of git-add
  2007-07-08 21:49                           ` Johannes Schindelin
  2007-07-09  9:45                             ` Matthieu Moy
@ 2007-07-13 17:36                             ` Matthieu Moy
  2007-07-13 17:41                               ` [PATCH] More permissive "git-rm --cached" behavior without -f Matthieu Moy
  1 sibling, 1 reply; 37+ messages in thread
From: Matthieu Moy @ 2007-07-13 17:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jan Hudec, Yann Dirson, Christian Jaeger

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> > However, if some of the files are of the first kind, and some are of 
>> > the second kind, you happily apply with mixed strategies.  IMO that is 
>> > wrong.
>> 
>> I'm not sure whether this is really wrong. The things git should
>> really care about are the index and the repository itself, and the
>> proposed behavior is consistant regarding that (either remove all
>> files from the index, or remove none).
>
> Well, I think it is wrong for the same reason as it is wrong to apply the 
> changes to _any_ file when one would fail.  And since "git apply" shares 
> my understanding, I think "git rm" should, too.

OK, I've been thinking about it for some time (not having time to hack
can be good, it lets time for thinking instead ;-) ).

I'm actually still not convinced that my proposal was wrong, but I
think we disagree because we disagree on what is a "failure". I
consider leaving the file in the working tree to be just a safety
precaution, not a failure, and to me, it's OK to do that only for the
files that need it.

Fixing my patch by just "applying the same strategy to all files"
would be wrong: leaving _all_ the files on disk when just one has
local modifications is very misleading, and if the user notices it
after running the command, he or she does not always have an easy way
to get back to a clean situation (re-running the same command with -f
wouldn't work for example).

So, I went a shorter way from the current semantics:

* Allow --cached in more situations, so that -f is really needed in
  very particular situation (as I mentionned above, forcing -f too
  often means the -f gets hardcoded in the fingers, and makes it
  useless).

* Better error message, which points to --cached in addition to -f.

That's very close to what bzr does, BTW.

Drawback: it still doesn't solve the "rm isn't the inverse of add".

The patch is quite straightforward, and will be in a followup email.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-13 17:36                             ` Matthieu Moy
@ 2007-07-13 17:41                               ` Matthieu Moy
  2007-07-13 17:57                                 ` Jeff King
                                                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Matthieu Moy @ 2007-07-13 17:41 UTC (permalink / raw)
  To: git, Johannes Schindelin; +Cc: Matthieu Moy

In the previous behavior, "git-rm --cached" (without -f) had the same
restriction as "git-rm". This forced the user to use the -f flag in
situations which weren't actually dangerous, like:

$ git add foo           # oops, I didn't want this
$ git rm --cached foo   # back to initial situation

Previously, the index had to match the file *and* the HEAD. With
--cached, the index must now match the file *or* the HEAD. The behavior
without --cached is unchanged, but provides better error messages.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
---
 Documentation/git-rm.txt |    3 ++-
 builtin-rm.c             |   32 ++++++++++++++++++++++++++------
 t/t3600-rm.sh            |   34 ++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index 78f45dc..be61a82 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -14,7 +14,8 @@ DESCRIPTION
 Remove files from the working tree and from the index.  The
 files have to be identical to the tip of the branch, and no
 updates to its contents must have been placed in the staging
-area (aka index).
+area (aka index).  When --cached is given, the staged content has to
+match either the tip of the branch *or* the file on disk.
 
 
 OPTIONS
diff --git a/builtin-rm.c b/builtin-rm.c
index 4a0bd93..9a808c1 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -46,7 +46,7 @@ static int remove_file(const char *name)
 	return ret;
 }
 
-static int check_local_mod(unsigned char *head)
+static int check_local_mod(unsigned char *head, int index_only)
 {
 	/* items in list are already sorted in the cache order,
 	 * so we could do this a lot more efficiently by using
@@ -65,6 +65,8 @@ static int check_local_mod(unsigned char *head)
 		const char *name = list.name[i];
 		unsigned char sha1[20];
 		unsigned mode;
+		int local_changes = 0;
+		int staged_changes = 0;
 
 		pos = cache_name_pos(name, strlen(name));
 		if (pos < 0)
@@ -87,14 +89,32 @@ static int check_local_mod(unsigned char *head)
 			continue;
 		}
 		if (ce_match_stat(ce, &st, 0))
-			errs = error("'%s' has local modifications "
-				     "(hint: try -f)", ce->name);
+			local_changes = 1;
 		if (no_head
 		     || get_tree_entry(head, name, sha1, &mode)
 		     || ce->ce_mode != create_ce_mode(mode)
 		     || hashcmp(ce->sha1, sha1))
-			errs = error("'%s' has changes staged in the index "
-				     "(hint: try -f)", name);
+			staged_changes = 1;
+
+		if (local_changes && staged_changes)
+			errs = error("'%s' has staged content different "
+				     "from both the file and the HEAD\n"
+				     "(use -f to force removal)", name);
+		else if (!index_only) {
+			/* It's not dangerous to git-rm --cached a
+			 * file if the index matches the file or the
+			 * HEAD, since it means the deleted content is
+			 * still available somewhere.
+			 */
+			if (staged_changes)
+				errs = error("'%s' has changes staged in the index\n"
+					     "(use --cached to keep the file, "
+					     "or -f to force removal)", name);
+			if (local_changes)
+				errs = error("'%s' has local modifications\n"
+					     "(use --cached to keep the file, "
+					     "or -f to force removal)", name);
+		}
 	}
 	return errs;
 }
@@ -192,7 +212,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 		unsigned char sha1[20];
 		if (get_sha1("HEAD", sha1))
 			hashclr(sha1);
-		if (check_local_mod(sha1))
+		if (check_local_mod(sha1, index_only))
 			exit(1);
 	}
 
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 13a461f..5c001aa 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -46,6 +46,40 @@ test_expect_success \
     'git rm --cached foo'
 
 test_expect_success \
+    'Test that git rm --cached foo succeeds if the index matches the file' \
+    'echo content > foo
+     git add foo
+     git rm --cached foo'
+
+test_expect_success \
+    'Test that git rm --cached foo succeeds if the index matches the file' \
+    'echo content > foo
+     git add foo
+     git commit -m foo
+     echo "other content" > foo
+     git rm --cached foo'
+
+test_expect_failure \
+    'Test that git rm --cached foo fails if the index matches neither the file nor HEAD' \
+    'echo content > foo
+     git add foo
+     git commit -m foo
+     echo "other content" > foo
+     git add foo
+     echo "yet another content" > foo
+     git rm --cached foo'
+
+test_expect_success \
+    'Test that git rm --cached -f foo works in case where --cached only did not' \
+    'echo content > foo
+     git add foo
+     git commit -m foo
+     echo "other content" > foo
+     git add foo
+     echo "yet another content" > foo
+     git rm --cached -f foo'
+
+test_expect_success \
     'Post-check that foo exists but is not in index after git rm foo' \
     '[ -f foo ] && ! git ls-files --error-unmatch foo'
 
-- 
1.5.3.rc1.4.gaf83-dirty

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-13 17:41                               ` [PATCH] More permissive "git-rm --cached" behavior without -f Matthieu Moy
@ 2007-07-13 17:57                                 ` Jeff King
  2007-07-13 18:53                                   ` Matthieu Moy
  2007-07-14  0:44                                 ` Jakub Narebski
  2007-07-14  6:52                                 ` Junio C Hamano
  2 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2007-07-13 17:57 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git, Johannes Schindelin

On Fri, Jul 13, 2007 at 07:41:38PM +0200, Matthieu Moy wrote:

> Previously, the index had to match the file *and* the HEAD. With
> --cached, the index must now match the file *or* the HEAD. The behavior
> without --cached is unchanged, but provides better error messages.

This does make more sense, but there are still some inconsistencies. Is
it OK to lose content that is only in the index, or not?

If it is OK, then --cached shouldn't need _any_ safety valve (and after
all, anything you remove in that manner is recoverable with git-fsck
until the next prune).

If it isn't OK, then you are not addressing the cases where git-rm
without --cached loses index content (that is different than HEAD and
the working tree).

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-13 17:57                                 ` Jeff King
@ 2007-07-13 18:53                                   ` Matthieu Moy
  2007-07-14  3:42                                     ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Matthieu Moy @ 2007-07-13 18:53 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Johannes Schindelin

Jeff King <peff@peff.net> writes:

> On Fri, Jul 13, 2007 at 07:41:38PM +0200, Matthieu Moy wrote:
>
>> Previously, the index had to match the file *and* the HEAD. With
>> --cached, the index must now match the file *or* the HEAD. The behavior
>> without --cached is unchanged, but provides better error messages.
>
> This does make more sense, but there are still some inconsistencies. Is
> it OK to lose content that is only in the index, or not?

I'd say it isn't OK. At least, that's what the previous git-rm
considered.

> If it is OK, then --cached shouldn't need _any_ safety valve (and after
> all, anything you remove in that manner is recoverable with git-fsck
> until the next prune).
>
> If it isn't OK, then you are not addressing the cases where git-rm
> without --cached loses index content (that is different than HEAD and
> the working tree).

Either I didn't understand your question, or the answer is "yes, I
do.". The behavior without --cached is not modified, except for the
error message, and the previous was to require -f whenever the index
doesn't match the head, *or* doesn't match the file. So, without
--cached, you need to have file=index=HEAD to be able to git-rm.

If I missunderstand you, please, provide a senario where my patch
doesn't do the expected.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-13 17:41                               ` [PATCH] More permissive "git-rm --cached" behavior without -f Matthieu Moy
  2007-07-13 17:57                                 ` Jeff King
@ 2007-07-14  0:44                                 ` Jakub Narebski
  2007-07-14  6:52                                 ` Junio C Hamano
  2 siblings, 0 replies; 37+ messages in thread
From: Jakub Narebski @ 2007-07-14  0:44 UTC (permalink / raw)
  To: git

Matthieu Moy wrote:

> In the previous behavior, "git-rm --cached" (without -f) had the same
> restriction as "git-rm". This forced the user to use the -f flag in
> situations which weren't actually dangerous, like:
> 
> $ git add foo           # oops, I didn't want this
> $ git rm --cached foo   # back to initial situation
> 
> Previously, the index had to match the file *and* the HEAD. With
> --cached, the index must now match the file *or* the HEAD. The behavior
> without --cached is unchanged, but provides better error messages.

Sensible.

There might be some discussion if what git-rm without --cached does
is right, but that is besides scope of this patch.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-13 18:53                                   ` Matthieu Moy
@ 2007-07-14  3:42                                     ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2007-07-14  3:42 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

On Fri, Jul 13, 2007 at 08:53:38PM +0200, Matthieu Moy wrote:

> do.". The behavior without --cached is not modified, except for the
> error message, and the previous was to require -f whenever the index
> doesn't match the head, *or* doesn't match the file. So, without
> --cached, you need to have file=index=HEAD to be able to git-rm.
> 
> If I missunderstand you, please, provide a senario where my patch
> doesn't do the expected.

Right, my point was that there is a case where running without --cached
could lose content: when there is no working tree file. However,
thinking about it more, I recall that Junio made the point that allowing
that behavior means the CVS idiom of "rm file; git-rm file" will just
work.

Not that that was a problem you introduced; I merely wanted to push for
total consistency rather than just handling --cached. But I think the
non --cached behavior is actually right now, so let me retract my
complaint.

And assuming the "git-rm when no working tree file" current behavior is
OK, then I think your patch removes the last consistency problem that I
mentioned in my state table here:

  http://article.gmane.org/gmane.comp.version-control.git/51449

So in a round-about way, I totally approve of your patch. Sorry for the
confusion.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-13 17:41                               ` [PATCH] More permissive "git-rm --cached" behavior without -f Matthieu Moy
  2007-07-13 17:57                                 ` Jeff King
  2007-07-14  0:44                                 ` Jakub Narebski
@ 2007-07-14  6:52                                 ` Junio C Hamano
  2007-07-14  7:16                                   ` Junio C Hamano
  2 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2007-07-14  6:52 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git, Johannes Schindelin

Although I would not be using it often myself, I think this
would make "git rm" more pleasant to use.

Thanks for the patch, and my thanks also go to people who
commented on the patch.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-14  6:52                                 ` Junio C Hamano
@ 2007-07-14  7:16                                   ` Junio C Hamano
  2007-07-14 10:14                                     ` Matthieu Moy
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2007-07-14  7:16 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: git, Johannes Schindelin

Junio C Hamano <gitster@pobox.com> writes:

> Although I would not be using it often myself, I think this
> would make "git rm" more pleasant to use.
>
> Thanks for the patch, and my thanks also go to people who
> commented on the patch.

Having said that, I think this comment is not quite right.

+		else if (!index_only) {
+			/* It's not dangerous to git-rm --cached a
+			 * file if the index matches the file or the
+			 * HEAD, since it means the deleted content is
+			 * still available somewhere.
+			 */

Personally I do not think "rm --cached" needs any such "safety",
even though I'll keep the check for now, primarily because
loosening the restriction later is always easier than adding new
restriction.  I really do not think this is about protecting the
user from "deleted content is not available anywhere else".

In this sequence:

	edit a-new-file
	git add a-new-file
        edit a-new-file
        git add a-new-file

we do not complain, even though we are *losing* the contents we
earlier staged.  If you replace the second "git add" with
"git-rm --cached", the sequence should work the same way.  In
either case, you are working towards your next commit, and most
likely are doing a partial commit (iow, your working tree does
not match any of the commit you create in the middle).  Earlier
you thought you would want one state of the file in the next
commit, but now you decided against putting that new file in the
first commit in the series.  You may make further updates to the
index and would make a commit, but after making the commit, your
working tree still has "a-new-file" and you can add the contents
from it for the later commit.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] More permissive "git-rm --cached" behavior without -f.
  2007-07-14  7:16                                   ` Junio C Hamano
@ 2007-07-14 10:14                                     ` Matthieu Moy
  0 siblings, 0 replies; 37+ messages in thread
From: Matthieu Moy @ 2007-07-14 10:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Schindelin

Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Although I would not be using it often myself, I think this
>> would make "git rm" more pleasant to use.
>>
>> Thanks for the patch, and my thanks also go to people who
>> commented on the patch.
>
> Having said that, I think this comment is not quite right.
>
> +		else if (!index_only) {
> +			/* It's not dangerous to git-rm --cached a
> +			 * file if the index matches the file or the
> +			 * HEAD, since it means the deleted content is
> +			 * still available somewhere.
> +			 */
>
> Personally I do not think "rm --cached" needs any such "safety",
> even though I'll keep the check for now, primarily because
> loosening the restriction later is always easier than adding new
> restriction.  I really do not think this is about protecting the
> user from "deleted content is not available anywhere else".

I agree that this is something you can argue about.

But in this case, the behavior without -f should be changed too. If
the file matches HEAD, then "git-rm file" should work, regardless of
the index then (but this situation is less frequent).

In any case, the situation where you might lose content in the index
by doing git-rm are rare: it means you started working on a file, did
"git-add" at least once, and edited the file again later, and then
decided you wanted to remove the file. So, requiring the -f flag in
those situation is not a real problem, even if the situation is
slightly-dangerous-but-not-quite-so.

I'm willing to work on another patch on top of this one if there's an
agreement on a better semantics. This one was about fixing something
which was IMHO wrong, but doesn't necessarily achieve perfection ;-).

-- 
Matthieu

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2007-07-14 10:14 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-02 18:09 git-rm isn't the inverse action of git-add Christian Jaeger
2007-07-02 19:42 ` Yann Dirson
2007-07-02 20:23   ` Christian Jaeger
2007-07-02 20:40     ` Yann Dirson
2007-07-02 20:54       ` Matthieu Moy
2007-07-02 21:05         ` Johannes Schindelin
2007-07-03 10:37           ` Matthieu Moy
2007-07-03 12:09             ` Johannes Schindelin
2007-07-03 13:40               ` Matthieu Moy
2007-07-03 14:21                 ` Johannes Schindelin
2007-07-04 20:08                 ` Jan Hudec
2007-07-05 13:44                   ` Matthieu Moy
2007-07-05 14:00                     ` David Kastrup
2007-07-08 17:36                     ` [RFC][PATCH] " Matthieu Moy
2007-07-08 18:10                       ` Johannes Schindelin
2007-07-08 20:34                         ` Matthieu Moy
2007-07-08 21:49                           ` Johannes Schindelin
2007-07-09  9:45                             ` Matthieu Moy
2007-07-13 17:36                             ` Matthieu Moy
2007-07-13 17:41                               ` [PATCH] More permissive "git-rm --cached" behavior without -f Matthieu Moy
2007-07-13 17:57                                 ` Jeff King
2007-07-13 18:53                                   ` Matthieu Moy
2007-07-14  3:42                                     ` Jeff King
2007-07-14  0:44                                 ` Jakub Narebski
2007-07-14  6:52                                 ` Junio C Hamano
2007-07-14  7:16                                   ` Junio C Hamano
2007-07-14 10:14                                     ` Matthieu Moy
2007-07-02 21:20       ` git-rm isn't the inverse action of git-add Christian Jaeger
2007-07-03  4:12         ` Jeff King
2007-07-03  4:47           ` Junio C Hamano
2007-07-03  4:59             ` Jeff King
2007-07-03  5:09               ` Junio C Hamano
2007-07-03  5:12                 ` Jeff King
2007-07-03  6:26                   ` Junio C Hamano
2007-07-11 12:20     ` Jakub Narebski
2007-07-11 18:56       ` Jan Hudec
2007-07-11 21:26         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).