Re: why is git destructive by default? (i suggest it not be!)

* Re: why is git destructive by default? (i suggest it not be!)
@ 2008-06-24  4:59 Teemu Likonen
       [not found] ` <e80d075a0806232201o3933d154he2b570986604c30a@mail.gmail.com>
  0 siblings, 1 reply; 64+ messages in thread
From: Teemu Likonen @ 2008-06-24  4:59 UTC (permalink / raw)
  To: David Jeske; +Cc: git

David Jeske wrote (2008-06-24 01:47 -0000):

> As a new user, I'm finding git difficult to trust, because there are
> operations which are destructive by default and capable of
> inadvertently throwing hours or days of work into the bit bucket.

I'm also quite new and I actually feel safe using git, and it's because
of reflog. No matter what I do (except manual reflog expire) I can see
where I was before with command

  git log --walk-reflogs

and get everything back. I have needed it a couple of times. So the
safety net is there, one just has to learn to trust it. :-)

Git is much safer than the standard Unix tools like rm, mv and cp.
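
As a minimal sketch of that safety net (the throwaway repository, file name and commit messages are made up for illustration), the following discards a commit with "reset --hard" and then brings it back from the reflog:

```shell
#!/bin/sh
set -e
# Demo identity so the script does not depend on global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                 # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo one > file.txt
git add file.txt
git commit -q -m 'first'
echo two > file.txt
git commit -q -a -m 'second'
lost=$(git rev-parse HEAD)        # remember the commit we are about to "lose"

git reset -q --hard HEAD~1        # 'second' is no longer reachable from master

git log --walk-reflogs --oneline  # ...but the reflog still lists it

git reset -q --hard 'HEAD@{1}'    # go back to where HEAD was one step ago
recovered=$(git rev-parse HEAD)
```

The recovery only works while the reflog entry exists, so it covers committed work, not uncommitted edits.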

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <e80d075a0806232201o3933d154he2b570986604c30a@mail.gmail.com>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found] ` <e80d075a0806232201o3933d154he2b570986604c30a@mail.gmail.com>
@ 2008-06-24  5:43   ` Teemu Likonen
  0 siblings, 0 replies; 64+ messages in thread
From: Teemu Likonen @ 2008-06-24  5:43 UTC (permalink / raw)
  To: David Jeske; +Cc: git

David Jeske wrote (2008-06-23 22:01 -0700):

> On Mon, Jun 23, 2008 at 9:59 PM, Teemu Likonen <tlikonen@iki.fi>
> wrote:
> 
> > I'm also quite new and I actually feel safe using git, and it's
> > because of reflog. No matter what I do (except manual reflog expire)
> > I can see where I was before with command
> >
> >  git log --walk-reflogs
> 
> 
> Perhaps I'm misunderstanding how to read it, but how do you tell where
> a branch was from the reflog if you inadvertently moved it?

Perhaps I'm misunderstanding what you mean, but I'll try to explain. In git,
branches are nothing but named pointers to a certain commit. If you "move
a branch" you actually rename the pointer, nothing more. With the command

  git log --walk-reflogs --all

you can see everything in your reflog. When a branch is moved (i.e.,
renamed), the "git log --walk-reflogs" output shows it like this:

  commit 269f10bca2273c1cf77831d5e23c6e0361217697
  Reflog: refs/heads/master@{2008-06-24 08:15:56 +0300} (Teemu Likonen <tlikonen@iki.fi>)
  Reflog message: Branch: renamed refs/heads/master to refs/heads/testbranch
  Author: Teemu Likonen <tlikonen@iki.fi>
  Date:   2008-03-25 19:10:40 +0200

  [commit message]

See the "Reflog message" field above. It tells what happened. The
"Reflog" field tells when it happened. If I later remove this
"testbranch" I can browse the reflog and create this branch (i.e.
a pointer) again with command "git branch testbranch 269f10bca". The
269f10bca comes from the commit ID of the above log item.
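
A small sketch of that recovery, using a throwaway repository with made-up file names and commit messages. Deleting a branch also deletes that branch's own reflog, so the commit ID is looked up in the HEAD reflog here:

```shell
#!/bin/sh
set -e
# Demo identity so the script does not depend on global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m 'base'

git checkout -q -b testbranch
echo work > work.txt
git add work.txt
git commit -q -m 'work on testbranch'
tip=$(git rev-parse HEAD)

git checkout -q master
git branch -q -D testbranch       # the pointer is gone, the commit is not

# Find the commit in the HEAD reflog by its reflog message (%gs).
found=$(git log --walk-reflogs --format='%H %gs' HEAD |
        awk '/commit: work on testbranch/ {print $1; exit}')

git branch testbranch "$found"    # recreate the pointer
```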

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
@ 2008-06-25 18:06 Dmitry Potapov
  0 siblings, 0 replies; 64+ messages in thread
From: Dmitry Potapov @ 2008-06-25 18:06 UTC (permalink / raw)
  To: David Jeske; +Cc: Brandon Casey, Jakub Narebski, Boaz Harrosh, git

On Tue, Jun 24, 2008 at 10:13:13PM -0000, David Jeske wrote:
> 
> Two things I'd like to make it easy for users to never do are:
> - delete data
> - cause refs to be dangling

Why? Let's suppose you work with CVS and you started to edit some
file and then realize that the change you are making is stupid. Wouldn't
you want to just discard these changes without committing them to
CVS?

Perhaps you are confused by thinking that git commit and cvs commit
are conceptually the same command. IMHO, a far better analogue to cvs
commit would be git push to a repository with a denyNonFastForwards
policy. Git commit allows you to save your changes locally as a series
of patches, but until you have pushed them, they are not in
permanent storage. You can change these patches, which implies that
old versions may become dangling and will be removed after the reflog
has expired. The idea of making it difficult to remove some local
commits is like the idea of an editor making it difficult to remove a
line... You gain nothing from that. What editors do instead is to
provide the Undo action. Similarly, Git allows you to get back to an
old state using the reflog.
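
To illustrate the "Undo" analogy, here is a rough sketch (throwaway repository, hypothetical file name) in which a local commit is rewritten with --amend and the old version is then restored from the reflog:

```shell
#!/bin/sh
set -e
# Demo identity so the script does not depend on global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo draft > notes.txt
git add notes.txt
git commit -q -m 'draft'
old=$(git rev-parse HEAD)

# Change the local "patch": this creates a new commit and leaves
# the old one dangling (referenced only by the reflog).
echo final > notes.txt
git commit -q -a --amend -m 'final'
new=$(git rev-parse HEAD)

# Undo: the reflog remembers where HEAD pointed before the amend.
undo=$(git rev-parse 'HEAD@{1}')
git reset -q --hard 'HEAD@{1}'
```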

Dmitry

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
@ 2008-06-24 12:21 Olivier Galibert
  0 siblings, 0 replies; 64+ messages in thread
From: Olivier Galibert @ 2008-06-24 12:21 UTC (permalink / raw)
  To: David Jeske; +Cc: Jakub Narebski, Avery Pennarun, Nicolas Pitre, git

On Tue, Jun 24, 2008 at 11:29:43AM -0000, David Jeske wrote:
> -- Jakub Narebski wrote:
> > If they are using '-f', i.e. force, they should know and be sure what
> > they are doing; it is not much different from 'rm -f *'.
> 
> Sure, no problem. I don't want the ability to "rm -f *". I'm raising my hand
> and saying "I don't want the power to do these things, so just turn off all the
> git commands that could be destructive and give me an alternate way to do the
> workflows I need to do". Just like a normal user on a unix machine doesn't run
> around with the power to rm -f /etc all the time, even though they may be able
> to su to root.

But you still have the power to /bin/rm -rf ~, which tends to have
worse results.  The root/user separation just tries to protect the
system's integrity from the user.  This is similar to git, which tries
to protect the repository's integrity, which is not the same thing as
the contents.

--force exists because it is sometimes useful.  If you block it behind
some config setting, whoever is concerned will just change the config
when he needs the command and never change it back.  And windows, fsck
and other things of the kind pretty much ruined the efficiency of
confirmations before dangerous/destructive operations.  So there isn't
much left beside engaging your brain before using --force on a
command.

  OG.

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5oEsvFEDjCjRW>]

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5oEswFEDjCZBN>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found] ` <willow-jeske-01l5PFjPFEDjCfzf-01l5oEswFEDjCZBN>
@ 2008-06-24 10:42   ` David Jeske
  2008-06-24 15:29     ` Brandon Casey
  2008-06-24 10:42   ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24 10:42 UTC (permalink / raw)
  To: git

As a more practical question, how do I do this workflow illustrated below?

It's sort of similar to the workflow that "git stash" is trying to support,
except that I have a bunch of commits instead of a bunch of
uncommitted-changes.

I pull a repository that looks like this:

.  a<--b<--c  <--master

Then I hack away to this, and then throw my own branch on the end, along with
master:

.  a<--b<--c<--d<--e<--f<--g  <--master (jeske)
.                             <--feature1 (jeske)

While the server looks like this:

.  a<--b<--c<--1<--2<--3  <--master (server)

I want to get my repository to look something like this:

.  a<--b<--c<--1<--2<--3  <--master (jeske)
.           \
.            d<--e<--f<--g   <-- feature1 (jeske)

So I can then do this:

.  a<--b<--c<--1<--2<--3<--zz  <--master (jeske)
.           \
.            d<--e<--f<--g   <-- feature1 (jeske)

..and then push zz onto the server after 3.

..and I want to do it with safe commands that won't leave any dangling
references. (say if I forget to put the feature1 branch on)

How do I do that?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 10:42   ` David Jeske
@ 2008-06-24 15:29     ` Brandon Casey
       [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5uqS9FEDjCcuF>
  0 siblings, 1 reply; 64+ messages in thread
From: Brandon Casey @ 2008-06-24 15:29 UTC (permalink / raw)
  To: David Jeske; +Cc: git

David Jeske wrote:
> As a more practical question, how do I do this workflow illustrated below?
> 
> It's sort of similar to the workflow that "git stash" is trying to support,
> except that I have a bunch of commits instead of a bunch of
> uncommitted-changes.
> 
> I pull a repository that looks like this:
> 
> .  a<--b<--c  <--master

git clone <master_repo>
cd master_repo

> 
> Then I hack away to this, and then throw my own branch on the end, along with
> master:
> 
> .  a<--b<--c<--d<--e<--f<--g  <--master (jeske)
> .                             <--feature1 (jeske)

hack hack hack
git commit -a -m 'd'
hack hack hack
git commit -a -m 'e'
hack hack hack
git commit -a -m 'f'
hack hack hack
git commit -a -m 'g'
git branch feature1

> 
> While the server looks like this:
> 
> .  a<--b<--c<--1<--2<--3  <--master (server)

git fetch

> I want to get my repository to look something like this:
> 
> .  a<--b<--c<--1<--2<--3  <--master (jeske)
> .           \
> .            d<--e<--f<--g   <-- feature1 (jeske)

git reset --hard origin/master

Side Note: you probably should have been developing on 'feature1' branch
from the start. 'reset --hard' is a special case. If feature1 is a private
branch for developing in, you may want to rebase it on top of master and retest
before merging into master and pushing so that you can maintain a nice linear
history when possible. Or you can just merge into master and then push.

> So I can then do this:
> 
> .  a<--b<--c<--1<--2<--3<--zz  <--master (jeske)
> .           \
> .            d<--e<--f<--g   <-- feature1 (jeske)

hack hack hack
git commit -a -m 'zz'

> 
> ..and then push zz onto the server after 3.

git push

> ..and I want to do it with safe commands that won't leave any dangling
> references. (say if I forget to put the feature1 branch on)

_Don't_ forget. 'reset --hard' is named that way for a reason. If you do
forget, git makes it _easy_ to recover from.

Let's say you _did_ forget. You did the 'reset --hard' on master and then
you committed the 'zz' change without creating the 'feature1' branch.
You can still create the feature1 branch since git saved the previous state
in the reflog. It is two changes back.

git branch feature1 master@{2}

If you didn't know it was two changes back, then you can look through the
reflog using 'git log -g master'. The commit message is there along with a
reflog message describing what action was performed.
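
The "forgot to create feature1" case can be replayed end to end. This is only a sketch: the two local directories "server" and "jeske" stand in for the real remote and clone, and the file names are invented.

```shell
#!/bin/sh
set -e
# Demo identity so the script does not depend on global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "server": a<--b<--c collapsed into a single commit 'c' for brevity.
git init -q "$work/server"
cd "$work/server"
git symbolic-ref HEAD refs/heads/master
echo c > base.txt
git add base.txt
git commit -q -m 'c'

git clone -q "$work/server" "$work/jeske"

# Meanwhile the server moves on.
echo 1 > server.txt
git add server.txt
git commit -q -m '1'

# Local work committed directly on master, with no feature1 branch.
cd "$work/jeske"
echo d > feature.txt
git add feature.txt
git commit -q -m 'd'
echo e >> feature.txt
git commit -q -a -m 'e'
tip=$(git rev-parse HEAD)

git fetch -q
git reset -q --hard origin/master   # master@{1} after the next commit
echo zz > zz.txt
git add zz.txt
git commit -q -m 'zz'               # master@{0}

# Two changes back on master is the tip of the forgotten work.
git branch feature1 'master@{2}'
```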

After saying all of that, here is how I think you _should_ have done things.
Notice I _did_not_ use 'reset --hard'.

git clone <master_repo>
cd master_repo
git checkout -b feature1   # we create our feature branch immediately since
                           # creating branches is so effortless in git. A
                           # private feature branch should _always_ be created
                           # and used for development.
hack hack hack
git commit -a -m 'd'       # Make our 4 commits on the feature branch
hack hack hack
git commit -a -m 'e'
hack hack hack
git commit -a -m 'f'
hack hack hack
git commit -a -m 'g'
git checkout master         # Let's go back to master
git pull                    # Fetch and merge the changes from the server
git checkout -b 'master_zz' # Create a branch for developing the zz feature
hack hack hack
git commit -a -m 'zz'       # Commit the zz feature
git checkout master         # Go back to master
git merge master_zz         # Merge zz
git push                    # And push master out
git branch -d master_zz     # Now we're done with master_zz since it's all merged in

Now you're in the same place you were above, you can continue developing your feature
on feature1 branch by checking it out. This is also where rebase comes in handy, since
you may want to rebase feature1 on top of the new current master. Once it is done and
retested, you merge it into master and push it out.

-brandon

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5uqS9FEDjCcuF>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5uqS9FEDjCcuF>
@ 2008-06-24 16:41         ` David Jeske
  2008-06-24 18:55           ` Brandon Casey
                             ` (2 more replies)
  2008-06-24 16:41         ` David Jeske
  1 sibling, 3 replies; 64+ messages in thread
From: David Jeske @ 2008-06-24 16:41 UTC (permalink / raw)
  To: Brandon Casey; +Cc: git

My takeaways from this thread:

- THANKS! to all of you for the detailed discussion, and for making git. Even
though it's still unfamiliar to me, I really enjoy (g)it!

- I don't think anyone here thinks git is beyond improvement. This discussion
did change my mind on a few things since my original post. I started this
discussion to share my "unacclimated usability suggestions", because after I
acclimate to git, I'll be telling new users that these idiosyncrasies are all
no big deal too.  :) I still think there is value in this list of suggestions.
I'll work on submitting patches...

- improve the man page description of "reset --hard" (see below)
- standardize all the potentially destructive operations (after gc) on "-f /
--force" to override
- add "checkout" to the git-gui history right-click menu, and make the danger
of "reset --hard" more obvious and require a confirmation dialog (the gui
equivalent of -f)

----------

a couple more specific responses below..

-- Rogan Dawes wrote:
> -- David wrote:
> > Let me guess, you're always running euid==0. :)
> Do you also ask the gnu coreutils folks to remove the -f option from their
> utilities?

-- Johannes Gilger wrote:
> I think the name of the command "reset" itself is a name which should
> prompt everyone to read a manpage before using it. [snip ]
> Nobody complains about rm --force or anything.

Isn't it nice that they standardized on "-f" and "--force" across ALL commands?

I would be inclined to talk to coreutils if it was "rm -f", "cp -R" (vs cp -r),
and "mv --aggressive" to do the respective non-safe versions.

It would simplify git's command-line-ui and cognitive load if it did the same
thing. Pick one standard for "overriding dangerous commands", instead of
"danger caps" and "danger --reset" and "danger -f". Consider branch which has
both "branch -[MD]" and "branch -f" in the same subcommand. What's wrong with
"branch -[md] -f"?

Of course --hard encourages one to read the manpage. However, git is using a
bunch of new terms for things, and uses at least those three different methods
to indicate command danger. Let's look at the wording on the manpage:

"Matches the working tree and index to that of the tree being switched
to. Any changes to tracked files in the working tree since <commit>
are lost."

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I interpreted this as "any [non committed changes] to tracked files in the
working tree since <commit> are lost."  I don't think this was a naive
interpretation. I still think that's the way it reads after this whole
conversation.

I'll work on my first patch for git:

-> "References to any working tree changes, and pulled changes, AND COMMITTED
CHANGES to tracked files in the branch after <commit> will be dropped, causing
them to be removed at the next garbage collect.".

-- Brandon Casey wrote:
> After saying all of that, here is how I think you _should_ have done things.
> Notice I _did_not_ use 'reset --hard'.

I was told that I can safely do "git checkout origin/master" instead of "reset
--hard" to get back to the pull point, in case I didn't branch ahead of time.
The wrinkle being that my "master" branch-pointer still points to my local
changes, so I need to move onto a different branchname before I push if I want
to avoid those changes going to the server, which is fine..

> git clone <master_repo>
> cd master_repo
> git checkout -b feature1 # we create our feature branch immediately since
> # creating branches is so effortless in git. A
> # private feature branch should _always_ be created
> # and used for development.

I'm beginning to see why I would always work this way, though if "private
feature branches should always be created and used for development", then I'm
unclear about why this isn't the default. git could implicitly create them when
I checkin a change on the head of a pulled branch. (i.e. user/branchname/id, or
something else). I'm reaching here, I'll need to use git more with other
developers to understand this better.

-------------------
Thanks again for all the detailed responses and explanations!

- David

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 16:41         ` David Jeske
@ 2008-06-24 18:55           ` Brandon Casey
  2008-06-25 12:20           ` Matthieu Moy
  2008-06-25 17:56           ` Jing Xue
  2 siblings, 0 replies; 64+ messages in thread
From: Brandon Casey @ 2008-06-24 18:55 UTC (permalink / raw)
  To: David Jeske; +Cc: git

David Jeske wrote:
> My takeaways from this thread:
> 
> - THANKS! to all of you for the detailed discussion, and for making git. Even
> though it's still unfamiliar to me, I really enjoy (g)it!
> 
> - I don't think anyone here thinks git is beyond improvement. This discussion
> did change my mind on a few things since my original post. I started this
> discussion to share my "unacclimated usability suggestions", because after I
> acclimate to git, I'll be telling new users that these idiosyncrasies are all
> no big deal too.  :) I still think there is value in this list of suggestions.
> I'll work on submitting patches...
> 
> - improve the man page description of "reset --hard" (see below)
> - standardize all the potentially destructive operations (after gc) on "-f /
> --force" to override

The thing is 'force' is not always the most descriptive word for the behavior
that you propose enabling with --force.

For the reset command in particular there is a --soft counterpart to --hard. They
are both modifiers on the term 'reset' i.e. a 'soft reset' or a 'hard reset'. The
default is what is called a 'mixed reset'.

'gc' is another command that has been mentioned along with its '--aggressive' option.
--force does not seem to make sense here either, since we are not necessarily forcing
anything to happen in the sense of overriding some safe guard. What is happening is
that possibly more cpu-intensive options are being selected when repacking (compressing)
the repository.

> Consider branch which has
> both "branch -[MD]" and "branch -f" in the same subcommand. What's wrong with
> "branch -[md] -f"?

I am inclined to agree here. I'm not sure why the options for 'git branch' were
created this way. I too have thought that a -f modifier on -m and -d would be
more intuitive.

> Of course --hard encourages one to read the manpage. However, git is using a
> bunch of new terms for things, and uses at least those three different methods
> to indicate command danger. Let's look at the wording on the manpage:
> 
> "Matches the working tree and index to that of the tree being switched
> to. Any changes to tracked files in the working tree since <commit>
> are lost."
> 
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> I interpreted this as "any [non committed changes] to tracked files in the
> working tree since <commit> are lost."  I don't think this was a naive
> interpretation. I still think that's the way it reads after this whole
> conversation.

I think the reason it only says that uncommitted changes are lost is because
the committed changes are not lost even though they may become unreachable
from the head of the current branch. They are still reachable at least from
the reflog, so they are not lost. The uncommitted changes _are_ lost and are
unrecoverable.

> I'll work on my first patch for git:
> 
> -> "References to any working tree changes, and pulled changes, AND COMMITTED
> CHANGES to tracked files in the branch after <commit> will be dropped, causing
> them to be removed at the next garbage collect.".

Uncommitted working tree changes are gone immediately. Anything that has already
been committed will be garbage collected only after it is not referenced by
anything else in the repository. A reference will be maintained in the reflog
for at least 30 days (by default).
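
For anyone who wants a longer window, the retention period is configurable. As a sketch, assuming the gc.reflogExpire and gc.reflogExpireUnreachable settings as described in current git-config documentation (defaults of 90 and 30 days respectively), a repository can be told never to expire reflog entries:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                 # throwaway repository for the demo
git init -q "$repo"
cd "$repo"

# Keep reflog entries forever in this repository, including entries
# for commits that are no longer reachable from the current branch tip.
git config gc.reflogExpire never
git config gc.reflogExpireUnreachable never

expire=$(git config gc.reflogExpire)
unreachable=$(git config gc.reflogExpireUnreachable)
echo "reflogExpire=$expire reflogExpireUnreachable=$unreachable"
```

The trade-off is that unreachable commits are then kept on disk until the setting is changed or the entries are expired by hand.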

> 
> -- Brandon Casey wrote:
>> After saying all of that, here is how I think you _should_ have done things.
>> Notice I _did_not_ use 'reset --hard'.
> 
> I was told that I can safely do "git checkout origin/master" instead of "reset
> --hard" to get back to the pull point, in case I didn't branch ahead of time.

I think 'git checkout origin/master' would be a little odd since this is usually
a remote tracking branch. 'git checkout -b mymaster origin/master' or similar
would be more common. This creates a new branch named 'mymaster'.
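
A short sketch of that alternative (the local "server" and "jeske" directories are stand-ins for the real remote and clone): the new branch starts at the pull point while master keeps pointing at the local commits.

```shell
#!/bin/sh
set -e
# Demo identity so the script does not depend on global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/server"
cd "$work/server"
git symbolic-ref HEAD refs/heads/master
echo c > base.txt
git add base.txt
git commit -q -m 'c'

git clone -q "$work/server" "$work/jeske"
cd "$work/jeske"
echo d > feature.txt
git add feature.txt
git commit -q -m 'd'              # local work committed on master
local_tip=$(git rev-parse master)

# Start a new branch at the pull point; master is left untouched.
git checkout -q -b mymaster origin/master

current=$(git symbolic-ref --short HEAD)
```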

-brandon

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 16:41         ` David Jeske
  2008-06-24 18:55           ` Brandon Casey
@ 2008-06-25 12:20           ` Matthieu Moy
  2008-06-25 17:56           ` Jing Xue
  2 siblings, 0 replies; 64+ messages in thread
From: Matthieu Moy @ 2008-06-25 12:20 UTC (permalink / raw)
  To: David Jeske; +Cc: Brandon Casey, git

"David Jeske" <jeske@google.com> writes:

> - standardize all the potentially destructive operations (after gc) on "-f /
> --force" to override

Depending on the definition of "potentially destructive", most
commands are "potentially destructive".

git pull loses the point where the branch used to point when the
reflog expires.

git add loses the old content of the index.

...

And adding too many --force options removes its real value. Many
people type "rm -fr" any time they just want "rm", just because they
were annoyed by the multiple interactive confirmations of plain "rm"
(if aliased to "rm -i"). Asking people to type --force all the time
makes one's fingers type --force mechanically, and removes all its value.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 16:41         ` David Jeske
  2008-06-24 18:55           ` Brandon Casey
  2008-06-25 12:20           ` Matthieu Moy
@ 2008-06-25 17:56           ` Jing Xue
  2 siblings, 0 replies; 64+ messages in thread
From: Jing Xue @ 2008-06-25 17:56 UTC (permalink / raw)
  To: David Jeske; +Cc: git

Quoting David Jeske <jeske@google.com>:

> - add "checkout" to the git-gui history right-click menu, and make the
> danger of "reset --hard" more obvious and require a confirmation dialog
> (the gui equivalent of -f)

Is that really necessary?  The way it works now, when I choose "reset  
foo branch to here", a dialog prompts me to pick from the three reset  
modes, with 'Mixed' being the default. So I'd have to explicitly pick  
'Hard', which has a message "discards ALL local changes" right next to  
it.  If people are so conditioned to ignore that, I doubt it'll take  
very long for them to be conditioned to just automatically confirm the  
confirmation dialog.

The same applies to the command line as well I guess - if having to  
manually type "--hard" does not make one stop and think about what  
they are doing, I can hardly see how "--hard --force" would do any  
better.

Cheers.
-- 
Jing Xue

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
@ 2008-06-24  8:35 Björn Steinbrink
  0 siblings, 0 replies; 64+ messages in thread
From: Björn Steinbrink @ 2008-06-24  8:35 UTC (permalink / raw)
  To: David Jeske; +Cc: Jakub Narebski, Avery Pennarun, Nicolas Pitre, git

On 2008.06.24 08:08:13 -0000, David Jeske wrote:
> To re-ask the same question I asked in my last post, using your ascii
> pictures...
> 
> 
> Let's assume we're here..
> 
> .<---.<---.<---A<---X<---Y    <--- master
> \
> \--B<---C    <--- customer_A_branch <=== HEAD
> 
> 
> And this person and everyone else moves their head pointers back to master
> without merging:
> 
> 
> .<---.<---.<---A<---X<---Y    <--- master              <=== HEAD
> \
> \--B<---C    <--- customer_A_branch
> 
> 
> Now, five years down the road, our tree looks like:
> 
> 
> .<---A<---X<---Y<---.<--.<--.(3 years of changes)<---ZZZ<--- master  <=== HEAD
> \
> \--B<---C   <--- customer_A_branch
> 
> And someone does:
> 
> git-branch -f customer_A_branch ZZZ
> 
> To bring us to:
> 
> .<---A<---X<---Y<---.<--.(3 years of changes)<---ZZZ<--- master  <=== HEAD
> \                                           \
> \--B<---C                                   \-- customer_A_branch
> 
> 
> ..at this point, will a GC keep "B<--C", or garbage collect the commits and
> throw them away?

That would throw away the changes in _that_ repository after the reflog
entry has expired. It would not affect any other repo yet, and if that
developer tries to push that new customer_A_branch, it would be refused,
because it is not a fast-forward. And if the repo he's trying to push to
simply doesn't allow any non-fast-forward pushes, then even push -f
won't help him to destroy anything.
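
A sketch of that last point, with a local bare repository standing in for the shared server (paths and file names are made up). With receive.denyNonFastForwards set on the receiving side, even a forced push of rewritten history is rejected:

```shell
#!/bin/sh
set -e
# Demo identity so the script does not depend on global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# The shared repository refuses non-fast-forward updates.
git init -q --bare "$work/server.git"
git -C "$work/server.git" config receive.denyNonFastForwards true

git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master
echo a > f
git add f
git commit -q -m 'a'
echo b > f
git commit -q -a -m 'b'
git push -q "$work/server.git" master
before=$(git -C "$work/server.git" rev-parse master)

# Rewrite local history so that it diverges from the server.
git reset -q --hard HEAD~1
echo x > f
git commit -q -a -m 'x'

# Even push -f is refused by the server-side policy.
if git push -q -f "$work/server.git" master 2>/dev/null; then
    forced=accepted
else
    forced=refused
fi
after=$(git -C "$work/server.git" rev-parse master)
echo "forced push: $forced"
```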

Björn

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <jeske@willow=01l5V7waFEDjChmh>]

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5V7wbFEDjCX7V>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found] ` <willow-jeske-01l5PFjPFEDjCfzf-01l5V7wbFEDjCX7V>
@ 2008-06-24  1:47   ` David Jeske
  2008-06-24 17:11     ` Boaz Harrosh
  2008-06-24  1:47   ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  1:47 UTC (permalink / raw)
  To: git

As a new user, I'm finding git difficult to trust, because there are operations
which are destructive by default and capable of inadvertently throwing hours or
days of work into the bit bucket.

More problematic, those commands have no discernible pattern that shows their
danger, and they need to be used to do typical everyday things. I'm starting to
feel like I need to use another source control system on top of the git
repository in case I make a mistake.  My philosophy is simple, I never never
never want to throw away changes, you shouldn't either. Disks are cheaper than
programmer hours. I can understand wanting to keep things tidy, so I can
understand ways to correct the 'easily visible changes', and also avoid pushing
them to other trees, but I don't understand why git needs to delete things.

For example, the following commands seem capable of totally destroying hours or
days of work. Some of them need to be used regularly to do everyday things, and
there is no pattern among them spelling out danger.

git reset --hard          : if another branch name hasn't been created
git rebase
git branch -D <branch>    : if branch hasn't been merged
git branch -f <new>       : if new exists and hasn't been merged
git branch -m <old> <new> : if new exists and hasn't been merged

I've heard from a couple users that the solution to these problems is to "go
dig what you need out of the log, it's still in there". However, it's only in
there until the log is garbage collected. This either means they are
destructive operations, or we expect "running without ever collecting the log"
to be a valid mode of operation... which I doubt is the case.

Question: How about assuring ALL operations can be done non-destructively by
default? Then make destructive things require an explicit action that follows a
common pattern.

Suggestion Illustration
-----------------------

Below is one illustration of how these commands could be changed to be entirely
non-destructive, while retaining the current functionality. It also allows you
to destroy stuff if you have lawyers breathing down your neck, or really really
can't afford the hard drive space for a couple lines of text (though I'll
personally make a donation to anyone in this state!) :)

1) Require the "--destroy" flag for ANY git operation which is capable of
destroying data such that it is unrecoverable. A narrow view of this is to only
consider checked-in repository data, and not metadata, such as the location of
a branchname. However, the broad view would be to include all/most metadata.

2) Make a pattern for branch names which are kept in the local tree, not
included in push/pull, not modifiable without first renaming, and not shown by
default when viewing all branch history. For example, "local-<date>-*"

3) make 'git reset --hard <commit>' safe

Automatically commit working set and make a branch name (if necessary) to avoid
changes being thrown away. The branch name could be of the form
"local-<date>-reset-<user>-<date>". If the user really wants to destroy it,
they could use the dangerous version "git reset --hard --destroy", or they
could just "git branch -d --destroy <branchname>" afterwards. Most users would
do neither.

4) make 'git rebase' safe

'rebase' would make a branch name before performing its operation, assuring it
was easy to get back to the previous state. Currently, "git rebase" turns this:

      A---B---C topic
     /
D---E---F---G master

Into this:

              A'--B'--C' topic
             /
D---E---F---G master

.. and in turn destroys the original changes. It would instead create this:

      A--B--C (x)    A'--B'--C' (y)
     /              /
D---E------F-------G master

(x) - local-<date>-rebase-topic-<commit for G>
(y) - topic
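
With today's commands the same effect can be approximated by hand. The following is only a sketch of that idea in a scratch repository, not an existing git feature; the backup name follows the "local-<date>-rebase-..." pattern proposed above, and the file names and commit messages are invented for the demo:

```shell
#!/bin/sh
# Sketch: name the pre-rebase commits before rebasing, so they stay reachable.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper for the demo: one commit per file, message equals the file stem.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file D
commit_file E
git checkout -q -b topic
commit_file A
commit_file B
commit_file C
git checkout -q master
commit_file F
commit_file G

# (x): keep a branch on the original A--B--C before they are rewritten.
backup="local-$(date +%Y%m%d)-rebase-topic-$(git rev-parse --short master)"
git branch "$backup" topic

# (y): topic is rebuilt on top of G as A'--B'--C'.
git rebase -q master topic
git log --oneline --graph "$backup" topic
```

The backup branch costs one small ref file and can be deleted once the rebased result has been checked.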

5) make 'git branch' follow rule 1 above (safe without --destroy)

Using any of the following commands without --destroy would cause them to
create a branch "local-<date>-rename-<old branch name>", to prevent the
destruction of the old branch location:

git branch -d <branchname>
git branch -M <old> <new>
git branch -f <branchname>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  1:47   ` David Jeske
@ 2008-06-24 17:11     ` Boaz Harrosh
  2008-06-24 17:19       ` Boaz Harrosh
  2008-06-24 18:18       ` Brandon Casey
  0 siblings, 2 replies; 64+ messages in thread
From: Boaz Harrosh @ 2008-06-24 17:11 UTC (permalink / raw)
  To: David Jeske; +Cc: git

David Jeske wrote:
> As a new user, I'm finding git difficult to trust, because there are operations
> which are destructive by default and capable of inadvertently throwing hours or
> days of work into the bit bucket.
> 
> More problematic, those commands have no discernible pattern that shows their
> danger, and they need to be used to do typical everyday things. I'm starting to
> feel like I need to use another source control system on top of the git
> repository in case I make a mistake.  My philosophy is simple, I never never
> never want to throw away changes, you shouldn't either. Disks are cheaper than
> programmer hours. I can understand wanting to keep things tidy, so I can
> understand ways to correct the 'easily visible changes', and also avoid pushing
> them to other trees, but I don't understand why git needs to delete things.
> 
> For example, the following commands seem capable of totally destroying hours or
> days of work. Some of them need to be used regularly to do everyday things, and
> there is no pattern among them spelling out danger.
> 
> git reset --hard          : if another branch name hasn't been created

git reset --hard is special; see below.

> git rebase
> git branch -D <branch>    : if branch hasn't been merged
> git branch -f <new>       : if new exists and hasn't been merged
> git branch -m <old> <new> : if new exists and hasn't been merged
> 
The rest of the commands are recoverable from the log as people said
but "git reset --hard" is not and should be *fixed*!

I use git reset --hard for two separate and distinct functions.
One - to move current branch head around from place to place.
Two - Throw away work I've edited

It has happened to me more than once that I wanted the first
and also got the second as an un-warned bonus, to the dismay
of my bosses. (What do I care if I need to write all this code
again)

I would like git-reset --hard to refuse if a git-diff HEAD
(both staged and unstaged) is not empty, with a -f / -n logic
like git-clean (and, like git-clean, a config file override of the default).

Now I know that the first usage above could be done with
git-branch -f that_branch the_other_branch. But that cannot
be performed on the current branch, and local changes
are not lost.

Lots of other potentially destructive git-commands check for local
changes and refuse to operate. To remedy them git-reset --hard
is recommended. I would prefer if there was a git-reset --clean -f/-n
for the first case and git reset --hard only for the second usage
case.
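
Until something like that exists, the refusal can be approximated with a small shell wrapper. This is only a sketch: safe_reset_hard is a hypothetical name, not a git command, and the scratch repository and notes.txt are there just to exercise it. It relies on "git diff --quiet HEAD" exiting non-zero when tracked files differ from HEAD, staged or not:

```shell
#!/bin/sh
# Sketch: refuse "reset --hard" while tracked files have uncommitted changes.
set -e

# Hypothetical wrapper; arguments are passed through to "git reset --hard".
safe_reset_hard() {
    if ! git diff --quiet HEAD; then
        echo "local changes present; refusing to reset --hard" >&2
        return 1
    fi
    git reset -q --hard "$@"
}

# Demo in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo one > notes.txt
git add notes.txt
git commit -q -m first

echo two >> notes.txt    # an uncommitted edit
if safe_reset_hard HEAD; then refused=no; else refused=yes; fi
echo "refused: $refused"
```

Untracked files are not covered by this check; they are not touched by reset --hard either.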

My $0.017
Boaz

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 17:11     ` Boaz Harrosh
@ 2008-06-24 17:19       ` Boaz Harrosh
  2008-06-24 19:08         ` Jakub Narebski
  2008-06-24 18:18       ` Brandon Casey
  1 sibling, 1 reply; 64+ messages in thread
From: Boaz Harrosh @ 2008-06-24 17:19 UTC (permalink / raw)
  To: David Jeske; +Cc: git

Boaz Harrosh wrote:
> David Jeske wrote:
>> As a new user, I'm finding git difficult to trust, because there are operations
>> which are destructive by default and capable of inadvertently throwing hours or
>> days of work into the bit bucket.
>>
>> More problematic, those commands have no discernible pattern that shows their
>> danger, and they need to be used to do typical everyday things. I'm starting to
>> feel like I need to use another source control system on top of the git
>> repository in case I make a mistake.  My philosophy is simple, I never never
>> never want to throw away changes, you shouldn't either. Disks are cheaper than
>> programmer hours. I can understand wanting to keep things tidy, so I can
>> understand ways to correct the 'easily visible changes', and also avoid pushing
>> them to other trees, but I don't understand why git needs to delete things.
>>
>> For example, the following commands seem capable of totally destroying hours or
>> days of work. Some of them need to be used regularly to do everyday things, and
>> there is no pattern among them spelling out danger.
>>
>> git reset --hard          : if another branch name hasn't been created
> 
> git reset --hard is special see below
> 
>> git rebase
>> git branch -D <branch>    : if branch hasn't been merged
>> git branch -f <new>       : if new exists and hasn't been merged
>> git branch -m <old> <new> : if new exists and hasn't been merged
>>
> The rest of the commands are recoverable from the log as people said
> but "git reset --hard" is not and should be *fixed*!
> 
> I use git reset --hard for two separate and distinct functions.
> One - to move current branch head around from place to place.
> Two - Throw away work I've edited
> 
> It has happened to me more than once that I wanted the first
> and also got the second as an un-warned bonus, to the dismay
> of my bosses. (What do I care if I need to write all this code
> again)
> 
> I would like git-reset --hard to refuse if a git-diff HEAD
> (both staged and unstaged) is not empty, with a -f / -n logic
> like git-clean (and, like git-clean, a config file override of the default).
> 
> Now I know that the first usage above could be done with
> git-branch -f that_branch the_other_branch. But that cannot
> be performed on the current branch, and local changes
> are not lost.
> 
> Lots of other potentially destructive git-commands check for local
> changes and refuse to operate. To remedy them git-reset --hard
> is recommended. I would prefer if there was a git-reset --clean -f/-n
> for the first case and git reset --hard only for the second usage
> case.
Sorry
git-reset --clean -f/-n for removing local changes
git reset --hard for moving HEAD on a clean tree only
> 
> My $0.017
> Boaz
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 17:19       ` Boaz Harrosh
@ 2008-06-24 19:08         ` Jakub Narebski
       [not found]           ` <willow-jeske-01l5PFjPFEDjCfzf-01l5zrLdFEDjCV3U>
  2008-06-25  8:57           ` Boaz Harrosh
  0 siblings, 2 replies; 64+ messages in thread
From: Jakub Narebski @ 2008-06-24 19:08 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: David Jeske, git

Boaz Harrosh <bharrosh@panasas.com> writes:

> Sorry
> git-reset --clean -f/-n for removing local changes
> git reset --hard for moving HEAD on a clean tree only

Wouldn't "git reset <commit-ish>" be enough then?  It modifies where
the current branch points (as opposed to git-checkout modifying what is
the current branch), and it modifies the index.  What it doesn't modify
is the working directory, but it is clean already.

So the solution is: don't use `--hard'.
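
A small sketch of that difference, in a scratch repository with a made-up file f.txt: after a plain "git reset" the branch and index point at the older commit, while the file on disk keeps its newer content and simply shows up as modified.

```shell
#!/bin/sh
# Sketch: "git reset <commit>" without --hard leaves the working directory alone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo v1 > f.txt
git add f.txt
git commit -q -m first
echo v2 > f.txt
git commit -q -a -m second

# Move the current branch (and the index) back one commit.
git reset -q HEAD~1

# The branch now points at "first", but f.txt still holds the newer content.
git log -1 --format=%s
cat f.txt
git status --short
```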

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5zrLdFEDjCV3U>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]           ` <willow-jeske-01l5PFjPFEDjCfzf-01l5zrLdFEDjCV3U>
  2008-06-24 20:04             ` David Jeske
@ 2008-06-24 20:04             ` David Jeske
  2008-06-24 21:42               ` Brandon Casey
  2008-06-24 22:21               ` Steven Walter
  1 sibling, 2 replies; 64+ messages in thread
From: David Jeske @ 2008-06-24 20:04 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Boaz Harrosh, git

> -- David Jeske wrote:
>> - improve the man page description of "reset --hard"
>> - standardize all the potentially destructive operations
>> (after gc) on "-f/--force" to override
>
> The thing is 'force' is not always the most descriptive word
> for the behavior that you propose enabling with --force.
I'm not talking about switching "git reset --hard" to "git reset -f". I'm
talking about requiring a "-f" option to "git reset --hard" when it would
destroy or dangle information.

(a) If you have a clean working directory which is fully checked in and has
another branch tag other than the current branch tag, then "git reset --hard
<commitish>" is non-destructive, and would complete happily.

(b) If you have local modifications to working files it would complain "hey,
your working files are dirty, 'reset --hard' will blow them away, either revert
them or use -f". This is what Boaz asked for, and I doubt it
would alter workflow much for people who are using "git reset --hard" to toss
attempted patches (since they were fully committed anyhow), or even undo a
clone or pull operation. If people use it as a combo "revert and reset", they
would notice.

(c) If the current location is only pointed to by the current branch (which you
are going to move with 'reset --hard') tell the user that those changes will be
dangling and will be eligible for garbage collection if they move the branch.
What to do in this case seems more controversial. I would prefer for this to
error with "either label these changes with 'branch', or use 'reset --hard -f'
to force us to leave these in the reflog unnamed".  --- Some here say that
being in the reflog is enough, and the -f is overkill here. If we define
destructive as dropping code-commits, then that's true. If we define
destructive as leaving code-commits unreferenced, then -f is warranted.
Personally, I'd rather git help me avoid dropping the NAMES to tips, because
even with GC-never, I don't really want to find myself crawling through SHA1
hashes and visualization trees to find them later, when git could have reminded
me to name a branch that would conveniently show up in 'git branch'. It's easy
enough to avoid dropping the names, or force git to not care with '-f'. I
personally would like to avoid dealing with reflog or SHA1 hashes 99% of the
time.
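
The check behind case (c) can be sketched with existing plumbing. This is an illustration only: other_refs_containing_head is a hypothetical helper, it looks at local branches and tags but not remote-tracking refs, and it assumes a git new enough for "for-each-ref --contains".

```shell
#!/bin/sh
# Sketch: would moving the current branch leave its tip without any other name?
set -e

# Hypothetical helper: print branches and tags, other than the current branch,
# whose history contains HEAD. Empty output means the tip would be left dangling.
other_refs_containing_head() {
    current=$(git symbolic-ref -q HEAD) || current=
    git for-each-ref --contains HEAD --format='%(refname)' refs/heads refs/tags |
        grep -v -x -F -- "$current" || true
}

# Demo in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m A
git checkout -q -b work
git commit -q --allow-empty -m B

before=$(other_refs_containing_head)   # only "work" holds B
git branch keep                        # label the tip, as suggested in (c)
after=$(other_refs_containing_head)
echo "before: '$before' after: '$after'"
```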

> 'gc' is another command that has been mentioned along
> with its '--aggressive' option.

This was an accident. When I made my "mv --aggressive" joke I was NOT intending
to reference "gc --aggressive", that is just a coincidence. I was trying to
make up another "semi-dangerous sounding name that might or might not be
destructive". It's comical that it's in use for gc. I don't see any
relationship between "gc --aggressive" and destructive behavior.

However, there IS a situation to require a "-f" on a gc, because again, "-f" would
be required for operations which destroy commits. If we think commits being in
the reflog is good enough to hold onto them, and users are thinking that items
being in the reflog are 'safe', then a GC where reflog entry expiration is
going to cause DAG entries to be removed could print an error like:

error: the following entries are beyond the expiration time,
...<base branchname>/<commit-ish>: 17 commits, 78 lines, 3 authors
...use diff <commit-ish> , to see the changes
...use gc -f, to cause them to be deleted

This wouldn't happen very often, and would make "gc" a safe operation even on
trees with shorter expiration time. In fact, if this were the way it worked, I
might set my GC back from never to "30 days", because this would not only allow
me to safely clean up junk, but it would also allow me to catch unnamed and
dangling references before they became so old I didn't remember what to name
them.

This would make a "non forced gc" safe from throwing away commits, but still
make it really easy to do so for people who want to. Likewise, we could make
any "auto-gc" that happens not forced by default.
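
Part of this can be previewed with commands that exist today. The sketch below, in a scratch repository with an invented "topic" branch, lists commits that only the reflog is keeping alive, and then sets the two reflog expiry knobs (gc.reflogExpire and gc.reflogExpireUnreachable) to "never". It shows the inspection step only; it is not the proposed "gc -f" behavior.

```shell
#!/bin/sh
# Sketch: find commits held only by reflog entries, and stop those entries expiring.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m base
git checkout -q -b topic
git commit -q --allow-empty -m "work on topic"
old_tip=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D topic    # the commit is now named only by HEAD's reflog

# Commits that would be unreachable if reflog entries were ignored.
only_in_reflog=$(git fsck --unreachable --no-reflogs 2>/dev/null |
    grep '^unreachable commit' || true)
echo "$only_in_reflog"

# Keep reflog entries indefinitely in this repository.
git config gc.reflogExpire never
git config gc.reflogExpireUnreachable never
```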

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 20:04             ` David Jeske
@ 2008-06-24 21:42               ` Brandon Casey
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l63P33FEDjCVQ0>
  2008-06-24 22:54                 ` Theodore Tso
  2008-06-24 22:21               ` Steven Walter
  1 sibling, 2 replies; 64+ messages in thread
From: Brandon Casey @ 2008-06-24 21:42 UTC (permalink / raw)
  To: David Jeske; +Cc: Jakub Narebski, Boaz Harrosh, git

David Jeske wrote:
>> -- David Jeske wrote:
>>> - improve the man page description of "reset --hard"
>>> - standardize all the potentially destructive operations
>>> (after gc) on "-f/--force" to override
>> The thing is 'force' is not always the most descriptive word
>> for the behavior that you propose enabling with --force.
> I'm not talking about switching "git reset --hard" to "git reset -f". I'm
> talking about requiring a "-f" option to "git reset --hard" when it would
> destroy or dangle information.

I only have the same advice I gave to Boaz. I think you should try to adjust
your workflow so that 'git reset' is not necessary. It seems that for the
functions you're trying to perform, 'checkout' and 'branch' should be used rather
than 'reset'.

Again, as I mentioned to Boaz, there is really no benefit to reusing a single
branch name if that is what you are trying to do. The cost of branching in git
is 41 bytes i.e. nil. The cost of updating the working directory which happens
during the 'reset --hard' is exactly the same whether I do
'reset --hard <some_branch>' or 'checkout -b new_branch <some_branch>'.

In nearly every case where I, personally, have used 'reset --hard', I was using
it because I didn't care what the current state of the working directory or the
index were. They were wrong and I was resetting to the right state. I believe
this was the intended use for the command.

I'm not sure why you want to use reset so often. If there is something in the
documentation that led you to want to use reset maybe it can be changed so that
other users are not led in the same way.

About the reflog..
The reflog is not a storage area. It's just a log, like /var/log/messages. It is
there to provide a way to recover from mistakes. Mistakes are usually recognized
fairly quickly. If you have not realized that you have made a mistake after 30
days, it may be pretty hard to recover from since people have imperfect memories.
If we did not garbage collect the reflog it would just continue to grow appending
useless piece of information after useless piece of information.

-brandon

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l63P33FEDjCVQ0>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l63P33FEDjCVQ0>
@ 2008-06-24 22:13                   ` David Jeske
  2008-06-24 22:13                   ` David Jeske
  1 sibling, 0 replies; 64+ messages in thread
From: David Jeske @ 2008-06-24 22:13 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Jakub Narebski, Boaz Harrosh, git

-- Brandon Casey wrote:
> I only have the same advice I gave to Boaz. I think you should try to adjust
> your workflow so that 'git reset' is not necessary. It seems that for the
> functions you're trying to perform, 'checkout' and 'branch' should be used
> rather than 'reset'.

Even when I change my workflow to avoid 'reset', I believe that the
user-interface of git will be stronger if it is a simpler expression of the
same functionality. One way to simplify it is to use a convention that is
standardized across a set of tools so we don't have to learn every little
nuance of every little feature independently.

Two things I'd like to make it easy for users to never do are:
- delete data
- cause refs to be dangling

Therefore, I'd like a simple convention I can apply across all commands, so
that if users never do them, they'll never do either of the above things. I'm
not alone.

I think some of the impedance mismatch between my suggestions, and current
usage, has to do with where I'd like to be next. This is a meaty topic, I'll
start another thread on "policy and mechanism for less-connected clients".

> I'm not sure why you want to use reset so often. If there is something in the
> documentation that led you to want to use reset maybe it can be changed so
> that other users are not led in the same way.

Yes, it's a problem in the git-gui and the "reset --hard" documentation. I'm
working on a patch.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 21:42               ` Brandon Casey
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l63P33FEDjCVQ0>
@ 2008-06-24 22:54                 ` Theodore Tso
  2008-06-24 23:07                   ` Junio C Hamano
  1 sibling, 1 reply; 64+ messages in thread
From: Theodore Tso @ 2008-06-24 22:54 UTC (permalink / raw)
  To: Brandon Casey; +Cc: David Jeske, Jakub Narebski, Boaz Harrosh, git

On Tue, Jun 24, 2008 at 04:42:49PM -0500, Brandon Casey wrote:
> Again, as I mentioned to Boaz, there is really no benefit to reusing
> a single branch name if that is what you are trying to do. The cost
> of branching in git is 41 bytes i.e. nil.

The main reason that I find for reusing a branch name is for my
integration branch.  I have a script which basically does:

git checkout integration
git reset --hard origin
git merge branch-A
git merge branch-B
git merge branch-C
git merge branch-D

I suppose I could have avoided the use of git reset with something
like this:

git update-index --refresh --unmerged > /dev/null
if git diff-index --name-only HEAD | read dummy; then
	echo "There are local changes; refusing to build integration branch!"
	exit 1
fi
git update-ref refs/heads/integration origin
git checkout integration
git merge branch-A
git merge branch-B
git merge branch-C
git merge branch-D

Instead, I've just learned to be careful and my use of git reset
--hard is mainly for historical reasons.  But the point is, I can very
easily think of workflows where it makes sense to reuse a branch name,
most of them having to do with creating integration branches which are
basically throwaways after I am done testing or building that combined
tree.

							- Ted

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 22:54                 ` Theodore Tso
@ 2008-06-24 23:07                   ` Junio C Hamano
  2008-06-25  2:26                     ` Theodore Tso
  0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2008-06-24 23:07 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Brandon Casey, David Jeske, Jakub Narebski, Boaz Harrosh, git

Theodore Tso <tytso@mit.edu> writes:

> The main reason that I find for reusing a branch name is for my
> integration branch.  I have a script which basically does:
>
> git checkout integration
> git reset --hard origin
> git merge branch-A
> git merge branch-B
> git merge branch-C
> git merge branch-D
>
> I suppose I could have avoided the use of git reset with something
> like this:
>
> git update-index --refresh --unmerged > /dev/null
> if git diff-index --name-only HEAD | read dummy; then
> 	echo "There are local changes; refusing to build integration branch!"
> 	exit 1
> fi
> git update-ref refs/heads/integration origin
> git checkout integration
> git merge branch-A
> git merge branch-B
> git merge branch-C
> git merge branch-D
>
> Instead, I've just learned to be careful and my use of git reset
> --hard is mainly for historical reasons.

This makes it sound as if avoiding "reset --hard" is a good thing, but I
do not understand why.

The reason you have the diff-index check in the second sequence is because
update-ref does not have the "local changes" check either.  You could have
used the same diff-index check in front of "reset --hard".
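
Roughly like this, as a sketch only: rebuild_integration is a hypothetical function name, the base ref is passed as a parameter instead of being hard-coded to origin, and the scratch repository with branch-A and branch-B exists only to exercise it.

```shell
#!/bin/sh
# Sketch: the diff-index check placed in front of "reset --hard".
set -e

# Hypothetical helper: rebuild_integration <base> <topic>...
rebuild_integration() {
    base=$1
    shift
    # Refresh cached stat information so diff-index does not report false positives.
    git update-index -q --refresh || true
    if ! git diff-index --quiet HEAD --; then
        echo "There are local changes; refusing to build integration branch!" >&2
        return 1
    fi
    git checkout -q integration
    git reset -q --hard "$base"
    for topic in "$@"; do
        git merge -q --no-edit "$topic"
    done
}

# Demo in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file base
git branch integration
git checkout -q -b branch-A master
commit_file a
git checkout -q -b branch-B master
commit_file b
git checkout -q master

rebuild_integration master branch-A branch-B    # clean tree: runs

echo dirty >> base.txt                           # uncommitted edit
if rebuild_integration master branch-A branch-B; then second=ran; else second=refused; fi
echo "second run: $second"
```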

Moreover, in your original sequence above, doesn't "git checkout
integration" list your local changes when you have any, and wouldn't that
be a clue enough that the next "reset --hard origin" would discard them?

> ...  But the point is, I can very
> easily think of workflows where it makes sense to reuse a branch name,
> most of them having to do with creating integration branches which are
> basically throwaways after I am done testing or building that combined
> tree.

Absolutely.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 23:07                   ` Junio C Hamano
@ 2008-06-25  2:26                     ` Theodore Tso
  2008-06-25  8:58                       ` Jakub Narebski
  2008-06-26 15:13                       ` Brandon Casey
  0 siblings, 2 replies; 64+ messages in thread
From: Theodore Tso @ 2008-06-25  2:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Casey, David Jeske, Jakub Narebski, Boaz Harrosh, git

On Tue, Jun 24, 2008 at 04:07:57PM -0700, Junio C Hamano wrote:
> > Instead, I've just learned to be careful and my use of git reset
> > --hard is mainly for historical reasons.
> 
> This makes it sound as if avoiding "reset --hard" is a good thing, but I
> do not understand why.

Well, it was Brandon Casey who was asserting that git reset --hard was
evil, which I generally don't agree with.  I do use workflows that use
it a fair amount, usually because it's more convenient to type "git
checkout <foo>; git reset --hard <baz>" than something involving "git
update-ref refs/heads/<foo> <baz>".  The former has more characters
than the latter, and involves more disk I/O since it mutates the
working directory; but it's something about needing to type
"refs/heads/" such that I generally tend to type "git checkout....
git reset".  I can't explain why; maybe it's just psychological.

The reason why I've been thinking that I should change my shell script
from:

	git checkout integration
	git reset --hard <foo>

to:

	git update-ref refs/heads/integration HEAD
	git checkout integration

is actually because the first tends to touch more files in the working
directory than the second (because if the integration branch is a week
or two old, the git checkout unwinds the global state by two weeks,
and then the git reset --hard has to bring the state back up to
recency; the second generally involves a smaller set of files
changing).  That's a very minor point, granted.

> The reason you have the diff-index check in the second sequence is because
> update-ref does not have the "local changes" check either.  You could have
> used the same diff-index check in front of "reset --hard".

Definitely true.  The reason why I don't have this check is because
I'm generally careful and I run a "git stat" to make sure there are no
local changes in the tree before I run the script.
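
A minimal sketch of putting the diff-index check in front of
"reset --hard" inside a script. The function name safe_reset is made
up for illustration. Note that diff-index only looks at tracked files,
so untracked files are ignored (reset --hard leaves those alone anyway).

```shell
# Hypothetical helper: reset the current branch to a target, but refuse
# when tracked files have uncommitted changes (staged or unstaged).
safe_reset() {
    # Refresh the index's cached stat info so diff-index is not fooled
    # by files whose timestamps changed but whose content did not.
    git update-index -q --refresh
    if ! git diff-index --quiet HEAD --; then
        echo "safe_reset: local changes present, not resetting" >&2
        return 1
    fi
    git reset --hard "$1"
}
```

In a script this would replace the bare reset, e.g. "git checkout
integration && safe_reset origin".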

> Moreover, in your original sequence above, doesn't "git checkout
> integration" list your local changes when you have any, and wouldn't that
> be a clue enough that the next "reset --hard origin" would discard them?

... because it's in a shell script; being fundamentally lazy, instead
of typing that sequence over and over again, I've scripted it.  :-)

	     	  	 	     	   - Ted

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-25  2:26                     ` Theodore Tso
@ 2008-06-25  8:58                       ` Jakub Narebski
  2008-06-25  9:14                         ` Junio C Hamano
  2008-06-26 15:13                       ` Brandon Casey
  1 sibling, 1 reply; 64+ messages in thread
From: Jakub Narebski @ 2008-06-25  8:58 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Junio C Hamano, Brandon Casey, David Jeske, Boaz Harrosh, git

On Wed, 25 Jun 2008, Theodore Tso wrote:

> The reason why I've been thinking that I should change my shell script
> from:
> 
>         git checkout integration
>         git reset --hard <foo>
> 
> to:
> 
>         git update-ref refs/heads/integration HEAD
>         git checkout integration

Hmmmm.... Wouldn't it be easier on fingers to use

          git reset --soft integration

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-25  8:58                       ` Jakub Narebski
@ 2008-06-25  9:14                         ` Junio C Hamano
  0 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2008-06-25  9:14 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Theodore Tso, Brandon Casey, David Jeske, Boaz Harrosh, git

Jakub Narebski <jnareb@gmail.com> writes:

> On Wed, 25 Jun 2008, Theodore Tso wrote:
>
>> The reason why I've been thinking that I should change my shell script
>> from:
>> 
>>         git checkout integration
>>         git reset --hard <foo>
>> 
>> to:
>> 
>>         git update-ref refs/heads/integration HEAD
>>         git checkout integration
>
> Hmmmm.... Wouldn't it be easier on fingers to use
>
>           git reset --soft integration

That does not do anything close to what Ted is doing, does it?

Anyway, here is how I conclude my git day:

	git checkout next
        ... merge more and test
        ... be happy that next is in very good shape ;-)
        git branch -f pu
        git checkout pu
        git merge ... merge other topics to rebuild pu
        git merge ...
        ...

which is probably a bit less error prone than update-ref, if you type from
the command line like I do.
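
A small scratch-repository sketch of the "git branch -f pu" step,
alongside the update-ref spelling discussed earlier in the thread. The
repository, file names and commit messages are invented. The point is
only that both commands repoint a branch that is not checked out
without touching the working tree, and that the previous tip of the
forced branch remains in its reflog.

```shell
# Identity for the demo commits only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" && git init -q .
echo base > file && git add file && git commit -q -m "base"
git branch -m next                 # call the current branch 'next'

git checkout -q -b pu              # last week's pu: one extra commit
echo old > topic && git add topic && git commit -q -m "old pu work"
old_pu=$(git rev-parse pu)

git checkout -q next               # next moves on
echo more >> file && git commit -q -a -m "more on next"

git branch -f pu                   # repoint pu at HEAD (next's tip)
pu_after=$(git rev-parse pu)
prev_pu=$(git rev-parse 'pu@{1}')  # old tip, from pu's reflog

# Plumbing spelling of the same kind of move; the full ref name
# (refs/heads/...) is required here.
git update-ref refs/heads/integration HEAD
```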

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-25  2:26                     ` Theodore Tso
  2008-06-25  8:58                       ` Jakub Narebski
@ 2008-06-26 15:13                       ` Brandon Casey
  1 sibling, 0 replies; 64+ messages in thread
From: Brandon Casey @ 2008-06-26 15:13 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Junio C Hamano, David Jeske, Jakub Narebski, Boaz Harrosh, git

Theodore Tso wrote:
> On Tue, Jun 24, 2008 at 04:07:57PM -0700, Junio C Hamano wrote:
>>> Instead, I've just learned to be careful and my use of git reset
>>> --hard is mainly for historical reasons.
>> This makes it sound as if avoiding "reset --hard" is a good thing, but I
>> do not understand why.
> 
> Well, it was Brandon Casey who was asserting that git reset --hard was
> evil, which I generally don't agree with.

I definitely don't think 'reset --hard' is evil. I _do_ think it is somewhat
of an advanced command. It should be used where it is appropriate. I think
it is a misuse of the command if it is used in place of checkout, which I got
the impression might be the case.

You described resetting an integration branch, Junio does a similar thing
with pu and these are both valid uses. This is what I was talking about
when I said that usually when I use reset I don't care about the state of
the branch I am resetting. I also agree there are many other valid uses for
'git reset --hard'.

-brandon

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 20:04             ` David Jeske
  2008-06-24 21:42               ` Brandon Casey
@ 2008-06-24 22:21               ` Steven Walter
  1 sibling, 0 replies; 64+ messages in thread
From: Steven Walter @ 2008-06-24 22:21 UTC (permalink / raw)
  To: David Jeske; +Cc: Jakub Narebski, Boaz Harrosh, git

On Tue, Jun 24, 2008 at 08:04:30PM -0000, David Jeske wrote:
> I'm not talking about switching "git reset --hard" to "git reset -f". I'm
> talking about requiring a "-f" option to "git reset --hard" when it would
> destroy or dangle information.

I think you're asking for something like the following...
-- 
-Steven Walter <stevenrwalter@gmail.com>
Freedom is the freedom to say that 2 + 2 = 4
B2F1 0ECC E605 7321 E818  7A65 FC81 9777 DC28 9E8F 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 19:08         ` Jakub Narebski
       [not found]           ` <willow-jeske-01l5PFjPFEDjCfzf-01l5zrLdFEDjCV3U>
@ 2008-06-25  8:57           ` Boaz Harrosh
  1 sibling, 0 replies; 64+ messages in thread
From: Boaz Harrosh @ 2008-06-25  8:57 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: David Jeske, git, Brandon Casey, Theodore Tso, Junio C Hamano

Jakub Narebski wrote:
> Boaz Harrosh <bharrosh@panasas.com> writes:
> 
> 
>> Sorry
>> git-reset --clean -f/-n for removing local changes
>> git reset --hard for moving HEAD on a clean tree only
> 
> Wouldn't "git reset <commit-ish>" be enough then?  It modifies where
> current branch points to (as opposed to git-checkout modifying what is
> the current branch), and it modifies index.  What it doesn't modify is
> working directory, but it is clean already.
> 

Does not work. Only --hard will do the job. The working directory is not
touched, and if you do a git-diff you'll see the diff between old-head
and new-head. But what I want is to start hacking or merging on new-head.

> So the solution is: don't use `--hard'.
> 

The closest to git reset --hard that I can think of is:

Let's say I have
$ git-branch -a
* mybranch
remote/master

I can
$ git reset --hard remote/master
Or I can
$ git-checkout -b temp_mybranch remote/master
$ git-branch -M temp_mybranch mybranch

The second will complain if I have local changes.
I have just written 2 scripts. One "git-reset" that
will filter out --hard before calling the original.
Second "git-reset--hard" that will do the above.

Stupid me no more. It will not happen to me again.
Just those poor new users out there, I guess you have to
fall off your bike at least once.
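
A rough sketch of what a "git-reset--hard" replacement along these
lines might look like, built from the checkout -b / branch -M pair
above. The function name move_branch and the temp_ prefix are made up
here; this is not the script mentioned above. One caveat: checkout only
refuses when the local changes touch files that differ between the two
commits; edits to other files are carried along rather than discarded.

```shell
# Hypothetical helper: point the current branch at <commit> the
# non-destructive way. checkout aborts, leaving everything as it was,
# if local changes would be overwritten by the switch.
move_branch() {
    target=$1
    branch=$(git symbolic-ref --short HEAD) || return 1
    git checkout -q -b "temp_$branch" "$target" || return 1
    # Rename the temporary branch over the original name.
    git branch -M "temp_$branch" "$branch"
}
```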

Boaz

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 17:11     ` Boaz Harrosh
  2008-06-24 17:19       ` Boaz Harrosh
@ 2008-06-24 18:18       ` Brandon Casey
  1 sibling, 0 replies; 64+ messages in thread
From: Brandon Casey @ 2008-06-24 18:18 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: David Jeske, git

Boaz Harrosh wrote:

> I use git reset --hard in two separate and distinct functions.
> One - to move current branch head around from place to place.

Why?

> Two - Throw away work I've edited

This is valid.

> It has happened to me more than once that I wanted the first
> and also got the second as an un-warned bonus, to the dismay 
> of my bosses.

Why are you using 'git reset' to do this? Why not just checkout
the branch? I think you are using 'reset' in ways it is not
intended to be used. Is there something in the documentation that
led you to believe that 'reset --hard' should be used to switch
branches? I do see an example of such a thing in everyday.txt.
It deals with setting 'pu' branch to the tip of the 'next' branch,
but the 'pu' branch has a special meaning in git.

It seems like you are using 'reset' when you should be using 'checkout'.

For example:

$ git branch
* mybranch
  master
  next
  maint
  pu

If I have 'mybranch' checked out and I want to make a change on top of
the 'next' branch, I wouldn't do 'git reset --hard next', I would either
'git checkout next' or 'git checkout -b next-feature next' or something
similar.

If I've already merged the changes from mybranch back into upstream, then
it's safe to delete it.

I recommend adopting a branch naming scheme where the branch name describes
the task that is to be accomplished. i.e. 'foo' is a bad branch name.

btw, you are not saving anything by trying to reuse branch names. All
a branch is, is a file with a 40 byte string and a newline. So creating
a branch entails writing 41 bytes to a file. Deleting a branch entails
deleting a single file that is only 41 bytes small.
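
The 41-byte figure is easy to check in a scratch repository. This
sketch assumes a SHA-1 repository whose ref is stored as a loose file
under .git/refs/heads (refs can also be packed into .git/packed-refs,
and repositories using another hash or ref storage format look
different). ref_file_size is a made-up helper name.

```shell
# Hypothetical helper: print the size in bytes of a branch's loose ref
# file, or fail if the ref is not stored as a loose file.
ref_file_size() {
    ref_file=$(git rev-parse --git-path "refs/heads/$1") || return 1
    [ -f "$ref_file" ] || return 1
    # Arithmetic expansion strips the padding some wc versions add.
    echo $(( $(wc -c < "$ref_file") ))
}
```

In a repository matching those assumptions, "ref_file_size mybranch"
prints 41: 40 hex digits plus the newline.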

I suggest trying to adjust your work flow so that 'reset --hard' is not necessary.

-brandon

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
       [not found] ` <willow-jeske-01l5PFjPFEDjCfzf-01l5V7wbFEDjCX7V>
  2008-06-24  1:47   ` David Jeske
@ 2008-06-24  1:47   ` David Jeske
  1 sibling, 0 replies; 64+ messages in thread
From: David Jeske @ 2008-06-24  1:47 UTC (permalink / raw)
  To: git

As a new user, I'm finding git difficult to trust, because there are operations
which are destructive by default and capable of inadvertently throwing hours or
days of work into the bit bucket.

More problematic, those commands have no discernible pattern that shows their
danger, and they need to be used to do typical everyday things. I'm starting to
feel like I need to use another source control system on top of the git
repository in case I make a mistake.  My philosophy is simple, I never never
never want to throw away changes, you shouldn't either. Disks are cheaper than
programmer hours. I can understand wanting to keep things tidy, so I can
understand ways to correct the 'easily visible changes', and also avoid pushing
them to other trees, but I don't understand why git needs to delete things.

For example, the following commands seem capable of totally destroying hours or
days of work. Some of them need to be used regularly to do everyday things, and
there is no pattern among them spelling out danger.

git reset --hard          : if another branch name hasn't been created
git rebase
git branch -D <branch>    : if branch hasn't been merged
git branch -f <new>       : if new exists and hasn't been merged
git branch -m <old> <new> : if new exists and hasn't been merged

I've heard from a couple users that the solution to these problems is to "go
dig what you need out of the log, it's still in there". However, it's only in
there until the log is garbage collected. This either means they are
destructive operations, or we expect "running without ever collecting the log"
to be a valid mode of operation... which I doubt is the case.

Question: How about assuring ALL operations can be done non-destructively by
default? Then make destructive things require an explicit action that follows a
common pattern.

Suggestion Illustration
-----------------------

Below is one illustration of how these commands could be changed to be entirely
non-destructive, while retaining the current functionality. It also allows you
to destroy stuff if you have lawyers breathing down your neck, or really really
can't afford the hard drive space for a couple lines of text (though I'll
personally make a donation to anyone in this state!) :)

1) Require the "--destroy" flag for ANY git operation which is capable of
destroying data such that it is unrecoverable. A narrow view of this is to only
consider checked-in repository data, and not metadata, such as the location of
a branchname. However, the broad view would be to include all/most metadata.

2) Make a pattern for branch names which are kept in the local tree, not
included in push/pull, not modifiable without first renaming, and not shown by
default when viewing all branch history. For example, "local-<date>-*"

3) make 'git reset --hard <commit>' safe

Automatically commit working set and make a branch name (if necessary) to avoid
changes being thrown away. The branch name could be of the form
"local-<date>-reset-<user>-<date>". If the user really wants to destroy it,
they could use the dangerous version "git reset --hard --destroy", or they
could just "git branch -d --destroy <branchname>" afterwards. Most users would
do neither.

4) make 'git rebase' safe

'rebase' would make a branch name before performing its operation, assuring it
was easy to get back to the previous state. Currently, "git rebase" turns this:

      A---B---C topic
     /
D---E---F---G master

Into this:

              A'--B'--C' topic
             /
D---E---F---G master

.. and in turn destroys the original changes. It would instead create this:

      A--B--C (x)    A'--B'--C' (y)
     /              /
D---E------F-------G master

(x) - local-<date>-rebase-topic-<commit for G>
(y) - topic

5) make 'git branch' follow rule 1 above (safe without --destroy)

Using any of the following commands without --destroy would cause them to
create a branch "local-<date>-rename-<old branch name>", to prevent the
destruction of the old branch location:

git branch -d <branchname>
git branch -M <old> <new>
git branch -f <branchname>
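
Something close to the proposed "local-<date>-..." backup branches can
be approximated today with a small wrapper. This is only a sketch of
the idea: the name with_backup and the exact branch naming are invented
here, and it protects committed history only, not uncommitted changes
in the working tree (item 3 above would also commit those first).

```shell
# Hypothetical wrapper: record the current tip under a backup branch,
# then run the given git subcommand.
# Usage: with_backup <label> <git-args...>
with_backup() {
    label=$1; shift
    branch=$(git symbolic-ref --short HEAD) || return 1
    backup="local-$(date +%Y%m%d-%H%M%S)-$label-$branch"
    git branch "$backup" HEAD || return 1
    echo "previous tip saved as $backup" >&2
    git "$@"
}
```

For example, "with_backup rebase rebase master" or "with_backup reset
reset --hard HEAD~1" leaves a local-... branch pointing at the old tip.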

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5V7wbFEDjCX7V@videotron.ca>]

[parent not found: <willow-jeske-01l5cKsCFEDjC=91MX@videotron.ca>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]   ` <willow-jeske-01l5cKsCFEDjC=91MX@videotron.ca>
@ 2008-06-24  2:17     ` Nicolas Pitre
       [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9>
       [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9@videotron.ca>
  0 siblings, 2 replies; 64+ messages in thread
From: Nicolas Pitre @ 2008-06-24  2:17 UTC (permalink / raw)
  To: David Jeske; +Cc: git

On Tue, 24 Jun 2008, David Jeske wrote:

> I've heard from a couple users that the solution to these problems is to "go
> dig what you need out of the log, it's still in there". However, it's only in
> there until the log is garbage collected. This either means they are
> destructive operations, or we expect "running without ever collecting the log"
> to be a valid mode of operation... which I doubt is the case.

Why not?

> Question: How about assuring ALL operations can be done non-destructively by
> default?

	git config --global gc.reflogexpire "2 years"
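
For reference, a sketch of the two reflog expiry settings side by side,
applied to a single scratch repository rather than globally. The values
are arbitrary examples. By default git keeps reflog entries for 90 days
(gc.reflogExpire) and entries not reachable from the current tip for
30 days (gc.reflogExpireUnreachable); the value "never" disables expiry.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .

# Keep reflog entries much longer than the defaults in this repository.
git config gc.reflogExpire "2 years"
git config gc.reflogExpireUnreachable "2 years"

# Read the settings back to confirm what gc will use.
expire=$(git config --get gc.reflogExpire)
expire_unreachable=$(git config --get gc.reflogExpireUnreachable)
echo "reflogExpire=$expire reflogExpireUnreachable=$expire_unreachable"
```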


Nicolas

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9>
@ 2008-06-24  3:18         ` David Jeske
  2008-06-24  8:14           ` Lea Wiemann
  2008-06-24  3:18         ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  3:18 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

-- Nicolas Pitre wrote:
>> or we expect "running without ever collecting the log"
>> to be a valid mode of operation... which I doubt is the case.
>
> Why not?

I see the hole I left in my logic, so let me restate.

... or we expect "human parsing of the log" is a valid common
user-interface for non-git developers.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  3:18         ` David Jeske
@ 2008-06-24  8:14           ` Lea Wiemann
  0 siblings, 0 replies; 64+ messages in thread
From: Lea Wiemann @ 2008-06-24  8:14 UTC (permalink / raw)
  To: David Jeske; +Cc: Nicolas Pitre, git

David Jeske wrote:
> ... or we expect "human parsing of the log" is a valid common
> user-interface for non-git developers.

As a side note, the reflog is not only a valid user interface, but an 
important one: As a local developer that feeds patches to the mailing 
list, I frequently change the history in my local repository (using 
rebase, reset and am, or pull --rebase) to keep the commits clean when 
they finally get merged upstream.  I *want* and *need* at least basic 
versioning for the various states my history is in.

IOW, I not only make changes to the tree and commit them to my master 
branch, but I also make changes to my master branch and "commit" them to 
(store them in) the reflog.
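
As a concrete illustration of treating the reflog as version history
for a branch, the sketch below rewrites a branch in a scratch
repository and then steps back to its previous state using the
branch@{1} syntax. The repository contents are invented for the demo.

```shell
# Identity for the demo commits only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" && git init -q .
echo one > notes && git add notes && git commit -q -m "first"
echo two >> notes && git commit -q -a -m "second"
branch=$(git symbolic-ref --short HEAD)
before=$(git rev-parse HEAD)

# "Change the history": drop the second commit from the branch.
git reset -q --hard HEAD~1

# The branch's reflog still records where it pointed a moment ago.
git log -g --oneline "$branch"
previous=$(git rev-parse "$branch@{1}")

# Step the branch back to that earlier state.
git reset -q --hard "$branch@{1}"
restored=$(git rev-parse HEAD)
```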

That's not an interesting use case if you're working on a branch that 
other people pull from, but for a local clone it's very useful.  (And 
it's a feature I haven't seen in any VCSes, FWIW.)

Best,

     Lea

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9@videotron.ca>]

[parent not found: <willow-jeske-01l5e9cgFEDjCh3F@videotron.ca>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]         ` <willow-jeske-01l5e9cgFEDjCh3F@videotron.ca>
@ 2008-06-24  4:03           ` Nicolas Pitre
       [not found]             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5fAcTFEDjCWA4>
       [not found]             ` <1978205964779154253@unknownmsgid>
  0 siblings, 2 replies; 64+ messages in thread
From: Nicolas Pitre @ 2008-06-24  4:03 UTC (permalink / raw)
  To: David Jeske; +Cc: git

On Tue, 24 Jun 2008, David Jeske wrote:

> -- Nicolas Pitre wrote:
> >> or we expect "running without ever collecting the log"
> >> to be a valid mode of operation... which I doubt is the case.
> >
> > Why not?
> 
> I see the hole I left in my logic, so let me restate.
> 
> ... or we expect "human parsing of the log" is a valid common
> user-interface for non-git developers.

The reflog is one of the primary user interfaces for all git users.
Please just try:

	git reflog

and see for yourself.

And if you want more details, then just try:

	git log -g

You may even try any combination of flags in addition to -g with
'git log'.

I hope you'll feel much safer then.


Nicolas

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5fAcTFEDjCWA4>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5fAcTFEDjCWA4>
  2008-06-24  4:40               ` David Jeske
@ 2008-06-24  4:40               ` David Jeske
  2008-06-24  5:24                 ` Jan Krüger
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  4:40 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

-- Nicolas Pitre wrote:
> I hope you'll feel much safer then.

I moved a branch around and then deleted it, and I don't see any record in the
reflog of where it was, or that it ever was.

Am I missing something about how branches are used? I see some language in "git
tag" about how attempts are made to assure that others can't move around
semi-immutable tags during push, but I don't see any such language about
branches. What prevents someone from accidentally deleting an old branch that
nobody is watching, but is important to the history and then not noticing as gc
silently deletes the old deltas?

I've had need to pull out versions several years old multiple times in my
career, so this is the kind of thing I'm thinking about.

git config --global gc.reflogexpire            "10 years"
git config --global gc.reflogexpireunreachable "10 years"

Makes me feel safer that the data will be in there, but even with the reflog
and access to the repository, I doubt I could FIND the place an old branch was
supposed to be if it was inadvertently deleted in a 2-million line source tree.
Am I just looking in the wrong places?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  4:40               ` David Jeske
@ 2008-06-24  5:24                 ` Jan Krüger
  0 siblings, 0 replies; 64+ messages in thread
From: Jan Krüger @ 2008-06-24  5:24 UTC (permalink / raw)
  To: David Jeske; +Cc: git

Hi David,

"David Jeske" <jeske@google.com> wrote:
> I moved a branch around and then deleted it, and I don't see any
> record in the reflog of where it was, or that it ever was.

If a branch you're trying to delete is not part (or, more correctly,
an ancestor) of your current branch, you'll get a warning that you have
to explicitly bypass by using -D rather than -d.

Still, after deleting the branch, its old tip will very likely show up
in the reflog for HEAD (at the point you last worked on the branch),
even if the branch name won't show up anywhere. After locating the
commit in there it's a simple case of git checkout -b whatever
HEAD@{123} to get back that branch.
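
A scratch-repository sketch of the recovery described above: delete an
unmerged branch with -D, then recreate it from the HEAD reflog. Branch
and file names are invented, and the HEAD@{1} index only holds for this
exact sequence of commands; in practice you would read the "git reflog"
output to find the right entry.

```shell
# Identity for the demo commits only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" && git init -q .
echo base > file && git add file && git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo work >> file && git commit -q -a -m "topic work"
lost=$(git rev-parse HEAD)

git checkout -q "$main"
git branch -D topic                # prints the old tip's short hash

# HEAD@{0} is the checkout back to $main; HEAD@{1} is "topic work".
git reflog
git branch topic-restored 'HEAD@{1}'
recovered=$(git rev-parse topic-restored)
```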

> What prevents someone from accidentally deleting an old branch that
> nobody is watching, but is important to the history and then not
> noticing as gc silently deletes the old deltas?

One thing to keep in mind is that deleting your branch locally won't
rid you of remote copies of it, so anything that you considered worth
sharing will probably survive even if you accidentally bypassed Git's
warning about deleting branches.

Best,
Jan

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <1978205964779154253@unknownmsgid>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]             ` <1978205964779154253@unknownmsgid>
@ 2008-06-24  5:20               ` Avery Pennarun
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l5gtQ7FEDjCWCC>
  2008-06-24  7:54                 ` Jakub Narebski
  0 siblings, 2 replies; 64+ messages in thread
From: Avery Pennarun @ 2008-06-24  5:20 UTC (permalink / raw)
  To: David Jeske; +Cc: Nicolas Pitre, git

On 6/24/08, David Jeske <jeske@google.com> wrote:
> I moved a branch around and then deleted it, and I don't see any record in the
>  reflog of where it was, or that it ever was.
>
>  Am I missing something about how branches are used? I see some language in "git
>  tag" about how attempts are made to assure that others can't move around
>  semi-immutable tags during push, but I don't see any such language about
>  branches. What prevents someone from accidentally deleting an old branch that
>  nobody is watching, but is important to the history and then not noticing as gc
>  silently deletes the old deltas?
>
>  I've had need to pull out versions several years old multiple times in my
>  career, so this is the kind of thing I'm thinking about.

git branches are actually a very different concept from branches in,
say, subversion.

In subversion, a branch is normally created so that you can do
parallel development, and then you merge whole batches of changes
(with 'svn merge') from one branch into another.  When you do this,
you create a single new commit in the destination branch that contains
*all* the changes.  So if you want to look back in history to see who
did which part of the change for what reason, you have to go back to
the branch you merged *from*.  Thus, it's very important in subversion
that old branches never disappear.

git's philosophy is different.  Branches are really just "temporary
tags".  A merge operation doesn't just copy data from one branch to
another: it actually joins the two histories together, so you can then
trace back through the exact history of the merged branches, commit by
commit.  "git log" will show each checkin to *either* branch
individually, instead of just one big "merge" checkin.

The end result is that even if you delete the source branch after
doing a merge, nothing is actually lost.  Thus, there's no reason for
git to try to make branches impossible to lose, as they are in svn.
In the event that you really needed that branch pointer, it's in the
reflog, as a few people have pointed out.

Another way to think of it is that svn's concept of a "branch" is
actually the "reflog" in git.  (svn records which data a particular
branch name points to over time, just like git's reflog does.)  git
branches are something else entirely; a git branch always points at
only a single commit, and has no history of its own.
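
To see that deleting a merged branch loses nothing, here is a sketch in
a scratch repository (names invented): the topic commit stays reachable
from the main branch after the branch name is gone, and "git branch -d"
only agrees to delete because the branch is fully merged.

```shell
# Identity for the demo commits only.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" && git init -q .
echo base > file && git add file && git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo feature > feature && git add feature && git commit -q -m "add feature"
topic_tip=$(git rev-parse HEAD)

git checkout -q "$main"
git merge -q --no-ff -m "merge topic" topic
git branch -d topic                # allowed: topic is merged into HEAD

# The topic commit is still part of the main branch's history.
if git merge-base --is-ancestor "$topic_tip" HEAD; then
    still_reachable=yes
else
    still_reachable=no
fi
git log --oneline --graph
```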

Does that help?  Perhaps it only confuses the issue :)

Have fun,

Avery

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5gtQ7FEDjCWCC>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l5gtQ7FEDjCWCC>
@ 2008-06-24  6:35                   ` David Jeske
  2008-06-24  7:24                     ` Jeff King
  2008-06-24  6:35                   ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  6:35 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Nicolas Pitre, git

Thanks for all the helpful responses...

-- Avery Pennarun wrote:
> git's philosophy is different. Branches are really just "temporary
> tags". A merge operation doesn't just copy data from one branch to
> another: it actually joins the two histories together, so you can then
> trace back through the exact history of the merged branches, commit by
> commit. "git log" will show each checkin to *either* branch
> individually, instead of just one big "merge" checkin.

If branches are "temporary tags" how do I see the actual code they had working
in their branch before they merged it?

I'm reading about rebase, and it sounds like something I would want to forever
disallow on my git repository, because it looks like it rewrites history and
makes it impossible to get to the state of the tree they actually had working
before the merge. However, something you say below both clarifies and confuses
this.

Am I understanding this wrong?

> The end result is that even if you delete the source branch after
> doing a merge, nothing is actually lost.

..and what if you never merge? That branch-pointer points to useful information
about a development attempt, but it was never merged. (imagine a different
development path was taken) They never created a tag because it's not clear
when that work was "done" (unlike a release, which is much more well
understood). What prevents someone from deleting the branch-pointer or moving
it to a different part of the tree, causing that set of changes to be a
dangling ref lost in a sea of refs. Later when someone goes back looking for
it, how would they ever find it in a sea of tens of thousands of checkins?

> Thus, there's no reason for git to try to make branches impossible
> to lose, as they are in svn.

Before I set the GC times to "100 years", there was a HUGE reason for git to
make those branch-pointers impossible to lose, because by default if you lose
them git actually garbage collects them and throws the diffs away after 90
days!

> Another way to think of it is that svn's concept of a "branch" is
> actually the "reflog" in git. (svn records which data a particular
> branch name points to over time, just like git's reflog does.) git
> branches are something else entirely; a git branch always points at
> only a single commit, and has no history of its own.

That's sort of helpful, and sort of confusing. I think of git's branches as
"branch pointers to the head of a linked-list of states of the tree".

As long as you keep those refs without deleting them, and you keep that branch
pointer to the head, you can walk back through the history of that branch. If
multiple developers are working in the branch (and not using rebase, and not
garbage collecting), can't you even go track down the working state of their
local clients while they were working before they merged?

If I'm understanding all that right, it's exactly the kind of functionality I
want -- the ability to reproduce the state of all working history, exactly as
it was when the code was actually working in someone's client a long time ago,
before they merged it to the mainline. Except the standard model seems to be to
let the system "garbage collect" all that history, and toss it away as
unimportant -- and in some cases it seems to even provide developers with ways
to more aggressively assure garbage collection makes it disappear.

Am I expecting too much out of git? It doesn't really feel like a source
control system for an organization that wants to save everything, forever, even
when those people and trees and home directories disappear. It feels like a
distributed patch manager that is much more automatic than sending around
diffs, but isn't overly concerned with providing access to old history. (which,
duh, is no surprise given that's what I expect it's doing for linux kernel)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  6:35                   ` David Jeske
@ 2008-06-24  7:24                     ` Jeff King
       [not found]                       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5jmMuFEDjChvB>
  0 siblings, 1 reply; 64+ messages in thread
From: Jeff King @ 2008-06-24  7:24 UTC (permalink / raw)
  To: David Jeske; +Cc: Avery Pennarun, Nicolas Pitre, git

On Tue, Jun 24, 2008 at 06:35:16AM -0000, David Jeske wrote:

> If branches are "temporary tags" how do I see the actual code they had
> working in their branch before they merged it?

You look at the shape of the history. But if it is really an important
event for you to say "this was the state right before some merge of
interest", then by all means, tag it with a real tag. Or don't delete
the branch.

Have you tried running gitk on the kernel or git repositories?

> I'm reading about rebase, and it sounds like something I would want to
> forever disallow on my git repository, because it looks like it
> rewrites history and makes it impossible to get to the state of the
> tree they actually had working before the merge. However, something
> you say below both clarifies and confuses this.

It does throw away the state before the rebase (well, there is no longer
a pointer to it; it is still recoverable via the reflog). But for most
push/pull collaboration, you probably want to be using merge. Rebase is
more useful for people who are more accustomed to a patch-based
workflow.

> > The end result is that even if you delete the source branch after
> > doing a merge, nothing is actually lost.
> 
> ..and what if you never merge? That branch-pointer points to useful
> information about a development attempt, but it was never merged.
> (imagine a different development path was taken) They never created a
> tag because it's not clear when that work was "done" (unlike a
> release, which is much more well understood). What prevents someone
> from deleting the branch-pointer or moving it to a different part of
> the tree, causing that set of changes to be a dangling ref lost in a
> sea of refs. Later when someone goes back looking for it, how would
> they ever find it in a sea of tens of thousands of checkins?

If it's not merged, then don't delete the branch pointer! And "git
branch -d" will even refuse to do the deletion, unless you force it with
"git branch -D".

And keep in mind that when you clone repos, you clone the branch
pointer. So if you have a centralized server that your developers push
and pull from, a stray "git branch -D" from one developer _doesn't_ ruin
it for the rest of them. All that does is delete the branch from their
local repo, but it still exists in the central repo and for all of the
other developers. But it's not clear to me what sort of developer
topology you're interested in.
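
The "git branch -d" safety check described above can be tried in a
throwaway repository. This is only a sketch: the branch name
"experiment" and the "demo" identity are made up for illustration.

```shell
# Made-up identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# An unmerged line of development with one commit of its own.
git checkout -q -b experiment
git commit -q --allow-empty -m "unmerged work"
git checkout -q "$main"

# -d refuses, because "experiment" is not merged into the current branch.
if git branch -d experiment 2>/dev/null; then
    safe_delete=allowed
else
    safe_delete=refused
fi

# -D is the explicit override.
git branch -D experiment >/dev/null && forced_delete=done

echo "git branch -d: $safe_delete; git branch -D: $forced_delete"
```

Note that the deletion only affects the repository it is run in; any
other clone that already has the branch keeps it.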

> Before I set the GC times to "100 years", there was a HUGE reason for git to
> make those branch-pointers impossible to lose, because by default if you lose
> them git actually garbage collects them and throws the diffs away after 90
> days!

I think most people are comfortable with "if I have an unmerged branch,
it stays forever. If I accidentally delete my branch, I have 30 days to
pull the tip out of my reflog". Sure, it's _possible_ to lose work. But
you could also accidentally "rm -rf" your .git directory. If you want an
extra layer of protection, push your work periodically to a backup repo.
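
As a rough illustration of pulling a tip back out of the reflog after
an accidental "git branch -D", assuming the branch had been checked out
at some point (so that HEAD's reflog saw its commits). The names here
are hypothetical.

```shell
# Made-up identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)

git checkout -q -b experiment
git commit -q --allow-empty -m "unmerged work"
git checkout -q "$main"
git branch -q -D experiment          # the "accident"

# HEAD's reflog still records the commit made while on the deleted branch.
# %gs is the reflog message, e.g. "commit: unmerged work".
lost=$(git log -g --format='%H %gs' HEAD |
       awk '/commit: unmerged work/ { print $1; exit }')

# Re-create the branch at the recovered tip.
git branch experiment "$lost"
recovered=$(git log -1 --format=%s experiment)
echo "recovered tip: $recovered"
```

In real use one would normally read the output of "git reflog" by eye
and pick the commit, rather than grep for a known message.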

> That's sort of helpful, and sort of confusing. I think of git's branches as
> "branch pointers to the head of a linked-list of states of the tree".

More or less true (they aren't linked lists, but arbitrary DAGs --
commits can have more than one parent (i.e., a merge) and can have many
children (i.e., many people build off in different directions from one
spot)).

> If I'm understanding all that right, it's exactly the kind of
> functionality I want -- the ability to reproduce the state of all
> working history, exactly as it was when the code was actually working
> in someone's client a long time ago, before they merged it to the
> mainline. Except the standard model seems to be to let the system
> "garbage collect" all that history, and toss it away as unimportant --
> and in some cases it seems to even provide developers with ways to
> more aggressively assure garbage collection makes it disappear.

I think you are confusing two aspects of history.

There is the commit DAG, which says "at some time T, the files were at
some state S, and the commit message by author A was M". And those
commits form a chain so you can see how the state of the files
progressed. And anything that is reachable through that history will
always be kept by git, and you can always go back to any point.

But we also give particular names to some points, like "this is tag
v1.0" or "this is the head of the experimental line of development". We
call those refs.  Git remembers those names until you ask it not to (by
deleting the ref).  And there is a history to those names, like
"experimental was at some commit C1. Then somebody committed and it was
at C2. And then they did a git-reset and it was at C3". And that history
is encapsulated in the reflog, and is purely local to each repository
(since git is distributed, it makes no sense to talk about "where the
experimental name pointed" without talking about a specific repo).

And the ref history is what gets garbage collected. Most people are fine
with that, because they care about the actual commit history, and the
reflog is just a convenient way of saying "oops, what was happening
yesterday?" But if you really care, then by all means, set the reflog
expiration much higher.
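
For reference, a sketch of raising the reflog expiration for a single
repository. The two variables are the ones documented in git-config(1);
"never" disables expiry, and the stock defaults are 90 days for
reachable entries and 30 days for unreachable ones. The repository
itself is a throwaway.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .

# Keep reflog entries indefinitely in this repository only.
git config gc.reflogExpire never
git config gc.reflogExpireUnreachable never

reflog_expire=$(git config --get gc.reflogExpire)
unreachable_expire=$(git config --get gc.reflogExpireUnreachable)
echo "gc.reflogExpire=$reflog_expire"
echo "gc.reflogExpireUnreachable=$unreachable_expire"
```

Adding --global to the "git config" calls would apply the same policy
to every repository of the current user.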

> Am I expecting too much out of git? It doesn't really feel like a
> source control system for an organization that wants to save
> everything, forever, even when those people and trees and home
> directories disappear. It feels like a distributed patch manager that
> is much more automatic than sending around diffs, but isn't overly
> concerned with providing access to old history. (which, duh, is no
> surprise given that's what I expect it's doing for linux kernel)

Git _will_ remember content forever, _if_ you put it into git. So if you
are saying "git won't remember work that employee X did after he is
gone", that isn't true. X's work will be part of the commit DAG and will
be a part of everybody's repo. If you are saying "I blew away employee
X's home directory, and he had a git repo in it, why didn't git save
that data?" then the problem is that you deleted the repo! If you are
concerned about that situation, have employee X push his work to a repo
that doesn't get deleted.

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5jmMuFEDjChvB>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5jmMuFEDjChvB>
@ 2008-06-24  7:31                         ` David Jeske
  2008-06-24  8:16                           ` Jeff King
  2008-06-24  7:31                         ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  7:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Avery Pennarun, Nicolas Pitre, git

-- Jeff King wrote:
> I think you are confusing two aspects of history.
>
> There is the commit DAG, which says "at some time T, the files were at
> some state S, and the commit message by author A was M". And those
> commits form a chain so you can see how the state of the files
> progressed. And anything that is reachable through that history will

okay.

> always be kept by git, and you can always go back to any point.

..are you saying that if I reset --hard, or delete a branch ref, or do a
rebase, and then do a GC beyond the GC timeout, that git will NEVER throw away
any of those DAGs? (the actual source diffs committed)

> And the ref history is what gets garbage collected. Most people are fine
> with that, because they care about the actual commit history, and the
> reflog is just a convenient way of saying "oops, what was happening
> yesterday?" But if you really care, then by all means, set the reflog
> expiration much higher.

My (possibly flawed) understanding was that it drops any DAG sections that are
not referenced by valid refs which are older than the GC timeout.

It came from wording like this in the docs:

"The optional configuration variable gc.reflogExpireUnreachable
can be set to indicate how long historical reflog entries which
are not part of the current branch should remain available in
this repository. These types of entries are generally created
as a result of using git commit --amend or git rebase and are the
commits prior to the amend or rebase occurring. Since
these changes are not part of the current project most users
^^^^^^^^^^^^^
will want to expire them sooner. This option defaults to 30 days."

In the above, I resolve "these changes" to "commits prior to the amend" in the
previous sentence.

"git-gc tries very hard to be safe about the garbage it collects.
In particular, it will keep not only objects referenced by your
current set of branches and tags, but also objects referenced by
the index, remote tracking branches, refs saved by
git-filter-branch(1) in refs/original/, or reflogs (which may
reference commits in branches that were later amended or rewound)."

In the above, I resolve "keep .. only objects referenced by your current set of
branches and tags [and some other stuff]" to "commits in the DAG pointed to by
refs [and other stuff]".
Are you saying this GC process will never collect source diffs in the DAG?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  7:31                         ` David Jeske
@ 2008-06-24  8:16                           ` Jeff King
       [not found]                             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S>
       [not found]                             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S@brm-avmta-1.central.sun.com>
  0 siblings, 2 replies; 64+ messages in thread
From: Jeff King @ 2008-06-24  8:16 UTC (permalink / raw)
  To: David Jeske; +Cc: Avery Pennarun, Nicolas Pitre, git

On Tue, Jun 24, 2008 at 07:31:31AM -0000, David Jeske wrote:

> ..are you saying that if I reset --hard, or delete a branch ref, or do a
> rebase, and then do a GC beyond the GC timeout, that git will NEVER throw away
> any of those DAGs? (the actual source diffs committed)

No. Git keeps the reachable DAG. So if the DAG is part of development
that is merged into one of your long running branches, or if you keep
around the branch that points to it, it will never go away.

> My (possibly flawed) understanding was that it drops any DAG sections
> that are not referenced by valid refs which are older than the GC
> timeout.

Yes. So the way to "forget" about some history is to stop referencing
it. And then, after a grace period, it will be removed.

> Are you saying this GC process will never collect source diffs in the
> DAG?

No, but it will only remove unreferenced things. And things only become
unreferenced through explicit user action. So you don't have to worry
about git GCing your work unexpectedly. You do have to worry about git
GCing things you have explicitly told it to delete.
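
A small experiment along these lines, assuming a throwaway repository
with made-up names: a commit survives "git gc" as long as anything
(here, HEAD's reflog) still references it, and only disappears once the
references are explicitly dropped.

```shell
# Made-up identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m "kept"
kept=$(git rev-parse HEAD)
main=$(git symbolic-ref --short HEAD)

git checkout -q -b scratch
git commit -q --allow-empty -m "discarded"
gone=$(git rev-parse HEAD)
git checkout -q "$main"
git branch -q -D scratch             # explicit user action no. 1

# The branch is gone, but HEAD's reflog still references the commit,
# so even an aggressive prune keeps it.
git gc -q --prune=now
git cat-file -e "$gone" 2>/dev/null && before=present

# Explicit user action no. 2: throw away the reflog entries as well.
git reflog expire --expire=now --expire-unreachable=now --all
git gc -q --prune=now
if git cat-file -e "$gone" 2>/dev/null; then after=present; else after=collected; fi

# Reachable history is untouched throughout.
git cat-file -e "$kept" && reachable=present

echo "unreferenced commit: $before, then $after; reachable commit: $reachable"
```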

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S>
@ 2008-06-24  8:30                               ` David Jeske
  2008-06-24  9:39                                 ` Jakub Narebski
  2008-06-24  8:30                               ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  8:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Avery Pennarun, Nicolas Pitre, git

This is mostly moot since I've understood that it's easy to set git to never
GC. I guess I'm curious about why those GC fields would ever be set to anything
other than never?

-- Jeff King wrote:
> No. Git keeps the reachable DAG. So if the DAG is part of development
> that is merged into one of your long running branches, or if you keep
> around the branch that points to it, it will never go away.

Right, that's what I thought.

I'm not primarily concerned with what developers can do to their local git
repositories. I'm concerned with what the default sync operations can let them
do to the crown-jewels in the 'central organization repositories' which
everyone is periodically pushing to.

I like that deleting a branch in your repo does not cause it to be deleted in
other repos. Presumably in an organization we could prevent the central repo
from ever accepting branch deletes from developers. (without some kind of
authorization)

Does it have the same protection for all operations that can cause DAGs to be
dangling? For example, if they "branch -f" and push the branch?

---

Again it's simple enough for me to just set the GC times to "never" on the
server, and I find git pretty pleasing because I'm a
short-attention-span-committer. On a perforce or cvs repository, I frequently
tar up subtrees between commits, so I don't lose my work -- git is light-years
ahead of this.

Quite a bit of my fear of losing data came from some issues in the git-gui. I'm
trying out git on a windows project, and windows-shells just don't work right,
so I'm using the "Git Gui". It turns out right-clicking on a history entry in
the gui has no checkout option, and the only option it does have which will let
you move the tree to that place is "reset --hard".. since this was the easiest
thing to find in the GUI, I assumed it was the right way to do it, and then all
my more recent changes disappeared. It doesn't seem to have reflog
functionality, so I couldn't find any way to get back all my changes. I ended
up having an old history window that I did another reset --hard in back to the
latest change, but I got scared about what git was doing underneath. The docs
clearly explained that it will garbage collect dangling refs, and frankly the
information about how often this happens is buried so deep I had no idea what
the frequency was.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  8:30                               ` David Jeske
@ 2008-06-24  9:39                                 ` Jakub Narebski
  0 siblings, 0 replies; 64+ messages in thread
From: Jakub Narebski @ 2008-06-24  9:39 UTC (permalink / raw)
  To: David Jeske; +Cc: Jeff King, Avery Pennarun, Nicolas Pitre, git

"David Jeske" <jeske@google.com> writes:

> This is mostly moot since I've understood that it's easy to set git
> to never GC. I guess I'm curious about why those GC fields would
> ever be set to anything other than never?

Because not everybody has unlimited quota / unlimited disk space?
Besides the growing repository, reflogs also grow even if you switch
between some limited set of commits.

Note however that IIRC reflogs are not enabled by default for bare
repositories, and public repositories should be bare (without a working
directory).  But see receive.denyNonFastForwards below.

> -- Jeff King wrote:
> >
> > No. Git keeps the reachable DAG. So if the DAG is part of development
> > that is merged into one of your long running branches, or if you keep
> > around the branch that points to it, it will never go away.
> 
> Right, that's what I thought.
> 
> I'm not primarily concerned with what developers can do to their
> local git repositories. I'm concerned with what the default sync
> operations can let them do to the crown-jewels in the 'central
> organization repositories' which everyone is periodically pushing
> to.
> 
> I like that deleting a branch in your repo does not cause it to be
> deleted in other repos. Presumably in an organization we could
> prevent the central repo from ever accepting branch deletes from
> developers. (without some kind of authorization)
> 
> Does it have the same protection for all operations that can cause
> DAGs to be dangling? For example, if they "branch -f" and push the
> branch?

git-config(1)

  receive.denyNonFastForwards::
        If set to true, git-receive-pack will deny a ref update which is
        not a fast forward. Use this to prevent such an update via a push,
        even if that push is forced. This configuration variable is
        set when initializing a shared repository.

That is even more than protection against leaving some commits
dangling.  This makes working on top of published branches safe.
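
To see the effect of receive.denyNonFastForwards, here is a sketch
using a local bare repository as a stand-in for the central server.
The directory names "central.git" and "dev" and the "demo" identity
are made up.

```shell
# Made-up identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d) && cd "$work" || exit 1
git init -q --bare central.git
git -C central.git config receive.denyNonFastForwards true

git clone -q central.git dev 2>/dev/null   # hides the "empty repository" warning
cd dev || exit 1
git commit -q --allow-empty -m "published"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Rewrite the published commit, then try to force it onto the server.
git commit -q --amend --allow-empty -m "rewritten"
if git push -q --force origin "$branch" 2>/dev/null; then
    forced_push=accepted
else
    forced_push=rejected
fi

# The server still holds the original commit.
server_subject=$(git -C ../central.git log -1 --format=%s "$branch")
echo "forced push: $forced_push; server tip: $server_subject"
```

This setting covers rewinds and rewrites of existing branches; branch
deletion over push is governed separately (see receive.denyDeletes in
git-config(1), if your version has it).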

If such an all-or-nothing policy is not for you, you can always set up
hooks, as shown for example in contrib/hooks/update-paranoid.

Or you can use a different workflow, where the maintainer _pulls_ from
other developers or groups of developers, or applies (git-am) patches from
email.  This way if you screw up, it would be your fault for not
having backups ;-)

[...]
> Quite a bit of my fear of losing data came from some issues in the
> git-gui. I'm trying out git on a windows project, and windows-shells
> just don't work right, so I'm using the "Git Gui". It turns out
> right-clicking on a history entry in the gui has no checkout option,

This might be a result of the fact that in older versions of git you
could not check out an arbitrary commit.  You can now use a so-called
"detached HEAD" (when the current branch pointer points directly to the
commit, instead of pointing to the current branch [name]); note however
that committing on top of a detached HEAD is discouraged.

> and the only option it does have which will let you move the tree to
> that place is "reset --hard".. since this was the easiest thing to
> find in the GUI, I assumed it was the right way to do it, and then
> all my more recent changes disappeared. It doesn't seem to have
> reflog functionality, so I couldn't find any way to get back all my
> changes.

There is always ORIG_HEAD, which predates the introduction of reflogs
and holds only the previous "version", as in

  $ git reset --hard ORIG_HEAD
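
A self-contained way to try this out, assuming a throwaway repository
with made-up commits. It only demonstrates recovery of committed work;
uncommitted changes wiped by "reset --hard" are not brought back by
ORIG_HEAD.

```shell
# Made-up identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m "older state"
git commit -q --allow-empty -m "newer work"
tip=$(git rev-parse HEAD)

# The accidental jump back in history; git saves the old tip in ORIG_HEAD.
git reset -q --hard HEAD~1
rewound=$(git log -1 --format=%s)

# Undo it.
git reset -q --hard ORIG_HEAD
restored=$(git rev-parse HEAD)

echo "after reset: $rewound; after undo: $(git log -1 --format=%s)"
```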


That said, it would be nice if git-gui had some reflog interface.

> [...] The docs clearly explained that it
> will garbage collect dangling refs, and frankly the information
> about how often this happens is buried so deep I had no idea what
> the frequency was.

git-gc(1), section called (surprise, surprise) "Configuration".

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 64+ messages in thread

[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S@brm-avmta-1.central.sun.com>]

[parent not found: <willow-jeske-01l5lTEoFEDjCVta@brm-avmta-1.central.sun.com>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                               ` <willow-jeske-01l5lTEoFEDjCVta@brm-avmta-1.central.sun.com>
@ 2008-06-24 10:01                                 ` Fedor Sergeev
  2008-06-24 10:24                                   ` David Jeske
  0 siblings, 1 reply; 64+ messages in thread
From: Fedor Sergeev @ 2008-06-24 10:01 UTC (permalink / raw)
  To: David Jeske; +Cc: git

On Tue, 24 Jun 2008, David Jeske wrote:
> This is mostly moot since I've understood that it's easy to set git to never
> GC. I guess I'm curious about why those GC fields would ever be set to anything
> other than never?

On Tue, 24 Jun 2008, David Jeske wrote:
> My philosophy is simple, I never never
> never want to throw away changes, you shouldn't either. Disks are cheaper than
> programmer hours. I can understand wanting to keep things tidy, so I can
> understand ways to correct the 'easily visible changes', and also avoid pushing
> them to other trees, but I don't understand why git needs to delete things.

It looks like you are severely restricting yourself by thinking about
source code management as a source code backup system only.

While this might be a valid mindset for a gatekeeper on a public
repository, it is way too restrictive for a developer who wants to have a
system that helps him do his job.
And for me, in my own job, the ability to experiment *safely* and
effectively, the ability to try out different histories, is the most
valuable asset that git brings to the world of SCMs.

My colleagues who were forced to use Mercurial for their job are really
unhappy about Mercurial's habit of not modifying history at all.
After a certain amount of time, just looking at the history of an actively
developed project causes a headache.

When you speak about allowing/disallowing destructive actions you actually
speak about policies.
Different organizations, different repositories have different policies.
And git is very flexible in allowing you to implement all those different
policies as you wish.

And whether the default policy should allow people to experiment freely or
not is a very delicate question, which I would not really have enough
courage to speculate on.

regards,
   Fedor.

P.S. Having said all that, I would really like to have an easy way to tie
non-default policies to repositories so that they propagate on clones. That
would be really helpful in big organizations. But that's another story.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 10:01                                 ` Fedor Sergeev
@ 2008-06-24 10:24                                   ` David Jeske
  2008-06-24 13:13                                     ` Theodore Tso
  0 siblings, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24 10:24 UTC (permalink / raw)
  To: Fedor Sergeev; +Cc: git

On Tue, Jun 24, 2008 at 3:01 AM, Fedor Sergeev <Fedor.Sergeev@sun.com> wrote:
> It looks like you are severely restricting your own way of thinking about
> a source code management as a source code backup system only.
>
> While this might be a valid mindset for a gatekeeper on a public repository
> it is way too restrictive for a developer that wants to have a system that
> helps him doing a job.

Odd. I've never been a gatekeeper. I'm just a developer who has burned
himself enough times that I want a tool (i.e. source control) to help
prevent me from ever destroying anything I create. I like that git is
doing nicer things with merge tracking than older systems, and that
it's easier for distributed teams to move changes around in more
interesting ways than "up to the server" and "down from the server".
However, I also want it to provide the guarantee that "if I don't
touch the files in .git, it'll never lose my commits", which sadly
isn't true by default. I'm glad I can easily change the GC policy, but
I question why this isn't the default.

In another discussion about this, one of my coworkers pointed out that
making the GC default "never" would be much safer for new users, and
new users don't really need to worry about collecting things until
their repositories get bigger anyhow.

I also think that it would be simpler for everyone to understand if
every operation which can cause a dangling graph node required the
exact same override method (i.e. -f is fine, the capitalization as in
-d -> -D is fine, some --force or --hard is fine, but currently the
system is using three different methods in three different places).

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 10:24                                   ` David Jeske
@ 2008-06-24 13:13                                     ` Theodore Tso
  0 siblings, 0 replies; 64+ messages in thread
From: Theodore Tso @ 2008-06-24 13:13 UTC (permalink / raw)
  To: David Jeske; +Cc: Fedor Sergeev, git

On Tue, Jun 24, 2008 at 03:24:00AM -0700, David Jeske wrote:
> Odd. I've never been a gatekeeper. I'm just a developer who has burned
> himself enough times that I want a tool (i.e. source control) to help
> prevent me from ever destroying anything I create.

It sounds like the main problem is that you need to learn more about
how to use your tools.  If you use the tools right, the number of
times that you'll accidentally overwrite a branch pointer is quite
small, and generally you notice right away; the default GC period of 30
days is a L-O-N-G time, and in practice it's more than enough time for
someone to notice that they screwed up.

So a couple of tips:

1) "git reflog show <branch name>" is a great way to only look at
changes to a particular branch.  ("git log -g" or "git reflog show"
defaults to showing the reflog for HEAD)

2) A number of accidents with "git rebase" happen because people
forget which branch they are on.  So having your command line prompt
tell you which branch you are on is really helpful.  Google "git
prompt shell" for some examples of how to do this.

I do something like this:

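function __prompt_git()
{
	local git_dir ref br top
	# Print nothing when we are not inside a git repository.
	git_dir=$(git rev-parse --git-dir 2>/dev/null) || return
	# Print nothing when HEAD is not on a branch.
	ref=$(git symbolic-ref HEAD 2>/dev/null) || return
	br=${ref#refs/heads/}
	# If a patches/<branch>/current file exists, append its contents.
	top=$(cat "$git_dir/patches/$br/current" 2>/dev/null) \
		&& top="/$top"
	echo "[$br$top]"
}

# User name and prompt character: "#" for root, "%" otherwise.
if [ "$UID" = 0 ]; then
	u="${LOGNAME}.root"
	p="#"
else
	u="$LOGNAME"
	p="%"
fi
# Mention the shell nesting level when in a subshell.
s=""
if [ "$SHLVL" != 1 ]; then
	s=", level $SHLVL"
fi
# \${PWD} and \$(__prompt_git) are evaluated each time the prompt is shown.
PS1="<${u}@${HOSTNAME}> {\${PWD}}$s  \$(__prompt_git)\n\!$p "
unset u p s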
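
Tip 1 above ("git reflog show <branch name>") can be tried in a
throwaway repository like this. It is only a sketch with made-up
commits; the point is that the per-branch reflog lists every position
the branch pointer has had, including the one before a reset.

```shell
# Made-up identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
branch=$(git symbolic-ref --short HEAD)
git reset -q --hard HEAD~1          # moves the branch pointer back

# One line per recorded position of the branch, newest first.
git reflog show "$branch"

entries=$(git reflog show "$branch" | wc -l | tr -d ' ')
newest=$(git reflog show --format=%gs "$branch" | head -n 1)
echo "$entries entries, newest: $newest"
```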

							- Ted

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l5gtQ7FEDjCWCC>
  2008-06-24  6:35                   ` David Jeske
@ 2008-06-24  6:35                   ` David Jeske
  1 sibling, 0 replies; 64+ messages in thread
From: David Jeske @ 2008-06-24  6:35 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Nicolas Pitre, git

Thanks for all the helpful responses...

-- Avery Pennarun wrote:
> git's philosophy is different. Branches are really just "temporary
> tags". A merge operation doesn't just copy data from one branch to
> another: it actually joins the two histories together, so you can then
> trace back through the exact history of the merged branches, commit by
> commit. "git log" will show each checkin to *either* branch
> individually, instead of just one big "merge" checkin.

If branches are "temporary tags", how do I see the actual code they had working
in their branch before they merged it?

I'm reading about rebase, and it sounds like something I would want to forever
disallow on my git repository, because it looks like it rewrites history and
makes it impossible to get to the state of the tree they actually had working
before the merge. However, something you say below both clarifies and confuses
this.

Am I understanding this wrong?

> The end result is that even if you delete the source branch after
> doing a merge, nothing is actually lost.

..and what if you never merge? That branch-pointer points to useful information
about a development attempt, but it was never merged (imagine a different
development path was taken). They never created a tag because it's not clear
when that work was "done" (unlike a release, which is much better
understood). What prevents someone from deleting the branch-pointer or moving
it to a different part of the tree, causing that set of changes to become a
dangling ref lost in a sea of refs? Later, when someone goes back looking for
it, how would they ever find it in a sea of tens of thousands of checkins?

> Thus, there's no reason for git to try to make branches impossible
> to lose, as they are in svn.

Before I set the GC times to "100 years", there was a HUGE reason for git to
make those branch-pointers impossible to lose, because by default if you lose
them git actually garbage collects them and throws the diffs away after 90
days!

> Another way to think of it is that svn's concept of a "branch" is
> actually the "reflog" in git. (svn records which data a particular
> branch name points to over time, just like git's reflog does.) git
> branches are something else entirely; a git branch always points at
> only a single commit, and has no history of its own.

That's sort of helpful, and sort of confusing. I think of git's branches as
"branch pointers to the head of a linked-list of states of the tree".

As long as you keep those refs without deleting them, and you keep that branch
pointer to the head, you can walk back through the history of that branch. If
multiple developers are working in the branch (and not using rebase, and not
garbage collecting), can't you even go track down the working state of their
local clients while they were working before they merged?

If I'm understanding all that right, it's exactly the kind of functionality I
want -- the ability to reproduce the state of all working history, exactly as
it was when the code was actually working in someone's client a long time ago,
before they merged it to the mainline. Except the standard model seems to be to
let the system "garbage collect" all that history, and toss it away as
unimportant -- and in some cases it seems to even provide developers with ways
to more aggressively assure garbage collection makes it disappear.

Am I expecting too much out of git? It doesn't really feel like a source
control system for an organization that wants to save everything, forever, even
when those people and trees and home directories disappear. It feels like a
distributed patch manager that is much more automatic than sending around
diffs, but isn't overly concerned with providing access to old history (which,
duh, is no surprise, given that's what I expect it's doing for the Linux kernel).


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  5:20               ` Avery Pennarun
       [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l5gtQ7FEDjCWCC>
@ 2008-06-24  7:54                 ` Jakub Narebski
       [not found]                   ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kQf4FEDjCXUa>
  1 sibling, 1 reply; 64+ messages in thread
From: Jakub Narebski @ 2008-06-24  7:54 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: David Jeske, Nicolas Pitre, git

It looks like for some reason not all messages made it to the git mailing
list, at least when reading it through GMane.  Strange...

"Avery Pennarun" <apenwarr@gmail.com> writes:
> On 6/24/08, David Jeske <jeske@google.com> wrote:

>> I moved a branch around and then deleted it, and I don't see any
>> record in the reflog of where it was, or that it ever was.

Deleting a branch deletes its reflog[*1*] (BTW, git prints a warning when
deleting a branch can result in [temporary] loss of [easy access to] some
commits), but you can still use the HEAD reflog (the "what was checked
out" reflog).
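
A minimal sketch of that recovery path, assuming a throwaway repository and
the hypothetical branch names 'topic' and 'topic-restored': the branch and its
own reflog are deleted, but the HEAD reflog still remembers the commit that
was checked out.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git commit -q --allow-empty -m "A"

git checkout -q -b topic
git commit -q --allow-empty -m "B"
topic_tip=$(git rev-parse HEAD)

git checkout -q master
git branch -q -D topic    # the branch pointer and its own reflog are gone

# The HEAD reflog still lists what was checked out:
#   HEAD@{0} is the switch back to master, HEAD@{1} is commit "B".
git --no-pager reflog -n 3
git branch topic-restored 'HEAD@{1}'
```

This only works while the HEAD reflog entry has not expired, which is why the
expiry settings matter for this kind of rescue.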

>> Am I missing something about how branches are used? I see some
>> language in "git tag" about how attempts are made to assure that
>> others can't move around semi-immutable tags during push, but I
>> don't see any such language about branches. What prevents someone
>> from accidentally deleting an old branch that nobody is watching,
>> but is important to the history and then not noticing as gc
>> silently deletes the old deltas?

BTW, branch _deletions_ are not transferred by default (even if you use
globbing refspecs, which is not the default); you have to use
"git remote prune <remote nick>" to remove remote-tracking branches
which track branches that got deleted on the remote.
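
A small sketch of that behaviour, assuming two throwaway repositories (the
directory names 'upstream' and 'clone' and the branch name 'topic' are made up
for the example, and fetch.prune is assumed to be unset):

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Remote" repository with one extra branch.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "initial"
git branch topic

git clone -q "$work/upstream" "$work/clone"

# Delete the branch on the remote side.
git branch -q -D topic

cd "$work/clone"
git fetch -q origin
git branch -r                 # origin/topic is still listed here

git remote prune origin       # now the stale tracking branch is removed
git branch -r
```

So a branch deleted upstream does not silently vanish from the clones; each
clone keeps its remote-tracking ref until someone prunes it explicitly.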

Besides, nobody and nothing can fully protect you from your own stupidity.
You can "accidentally" do 'rm -rf .git', for example :-/

>> I've had need to pull out versions several years old multiple times
>> in my career, so this is the kind of thing I'm thinking about.

The answer is: don't delete branches accidentally ;-).

Seriously, in any sane workflow you have several long-lasting
branches, be it 'maint', 'master', 'next' or be it 'maintenance',
'stable'/'mainline'/'trunk', 'devel', into which you merge
[temporary, short-lived] topic branches when a topic is ready for
inclusion.  And you NEVER delete such branches (git can't protect you
from deletion any more than Linux can protect you if you do "rm -rf ~").

Any commit reachable through a parentage line from one of those
long-lived "development" branches is protected from pruning
during a git-gc run.

> git branches are actually a very different concept from branches in,
> say, subversion.
> 
> In subversion, a branch is normally created so that you can do
> parallel development, and then you merge whole batches of changes
> (with 'svn merge') from one branch into another.  When you do this,
> you create a single new commit in the destination branch that contains
> *all* the changes.  So if you want to look back in history to see who
> did which part of the change for what reason, you have to go back to
> the branch you merged *from*.  Thus, it's very important in subversion
> that old branches never disappear.
> 
> git's philosophy is different.  Branches are really just "temporary
> tags".

I'd rather say that branches (refs/heads branches) are "growth points"
of the graph (diagram) of revisions (versions).  (This graph is called the
DAG in git documentation, because it is a Directed Acyclic Graph.)

But it is true that in git branches are just _pointers_ into the DAG
of commits.  All data is kept in the content-addressed object database
which is the git repo storage, and parentage links are contained in commit
objects.

> A merge operation doesn't just copy data from one branch to
> another: it actually joins the two histories together, so you can then
> trace back through the exact history of the merged branches, commit by
> commit.  "git log" will show each checkin to *either* branch
> individually, instead of just one big "merge" checkin.

Let me help explain that using an ASCII-art diagram.  You need to
use a fixed-width (non-proportional) font to view it correctly.  Time
flows from left to right.

Let's assume that we have the following state: some history on branch
'master': 

       object database              refs information
    /-------------------\        /---------------------\

     .<---.<---.<---A             <--- master <=== HEAD

For the commits the "<---" arrow means that commit on the right side
of arrow has commit on the left hand side of arrow as its parent
(saved in the multi-valued "parent" field in the commit object).  For
the references "<---" arrow means that branch master points to given
commit, and "<===" means symbolic reference, i.e. that ref points to
given branch (you can think of it as symlink, and it was some time ago
implemented as such).

Now assume that we created a new branch 'test', and committed
some revisions on it:

     .<---.<---.<---A             <--- master
                     \
                      \-B<---C    <--- test     <=== HEAD

Let's assume that we, or somebody else, did some work on the 'master'
branch (so as not to confuse you with the "fast-forward" issue):

     .<---.<---.<---A<---X<---Y    <--- master
                     \
                      \--B<---C    <--- test     <=== HEAD

Now we have finished the feature which we tried to develop in 'test', so
we merge the changes back to 'master':

     .<---.<---.<---A<---X<---Y<---M       <--- master <=== HEAD
                     \            /
                      \--B<---C<-/         <--- test

Note how merge commit 'M' has two parents.

If we were to delete branch 'test' now:

     .<---.<---.<---A<---X<---Y<---M       <--- master [<=== HEAD]
                     \            /
                      \--B<---C<-/

it is only the pointer that gets deleted (and its reflog[*1*]).  All commits
which were on this branch are still 'reachable', so they would never get
deleted, even if the [HEAD] reflog expires[*2*].
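
The sequence drawn above can be replayed as a rough sketch in a throwaway
repository. The commit() helper is made up for the example; it creates one
file per commit so that the merge has no conflicts. After the merge, branch
'test' is deleted, all reflogs are expired and git-gc is run, yet commit C is
still reachable from 'master' through the second parent of the merge commit.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit per call, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b test
commit B
commit C
c=$(git rev-parse HEAD)

git checkout -q master
commit X
commit Y
git merge -q --no-edit test      # creates the two-parent merge commit M

git branch -q -d test            # -d refuses to delete unmerged branches
git reflog expire --expire=now --all
git gc -q --prune=now

git --no-pager log --oneline --graph
git merge-base --is-ancestor "$c" master && echo "C is still reachable"
```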

> The end result is that even if you delete the source branch after
> doing a merge, nothing is actually lost.  Thus, there's no reason for
> git to try to make branches impossible to lose, as they are in svn.
> In the event that you really needed that branch pointer, it's in the
> reflog, as a few people have pointed out.

s/in the reflog/in the HEAD reflog/.

See above for explanation with pictures (or if you want some graphics,
take a look at presentations linked from GitLinks page and/or
GitDocumentation page on git wiki, http://git.or.cz/gitwiki/).

HTH

Footnotes:
==========
[*1*] There was an effort to create some sort of 'Attic' / 'trash can'
for deleted reflogs, but I guess it got stalled.  There is a technical
issue caused by the fact that reflogs are stored as files, and you can
have a so-called file<->directory conflict when you delete branch
'foo' and create branch 'foo/bar'.

[*2*] You can always write "never" as time to expire, and it even
works now ;-)
-- 
Jakub Narebski
Poland
ShadeHawk on #git


[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5kQf4FEDjCXUa>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                   ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kQf4FEDjCXUa>
  2008-06-24  8:08                     ` David Jeske
@ 2008-06-24  8:08                     ` David Jeske
  2008-06-24 11:22                       ` Jakub Narebski
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24  8:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Avery Pennarun, Nicolas Pitre, git

To re-ask the same question I asked in my last post, using your ascii
pictures...


Let's assume we're here..

.<---.<---.<---A<---X<---Y    <--- master
                \
                 \--B<---C    <--- customer_A_branch <=== HEAD


And this person and everyone else moves their head pointers back to master
without merging:


.<---.<---.<---A<---X<---Y    <--- master              <=== HEAD
                \
                 \--B<---C    <--- customer_A_branch


Now, five years down the road, our tree looks like:


.<---A<---X<---Y<---.<--.<--.(3 years of changes)<---ZZZ<--- master  <=== HEAD
      \
       \--B<---C   <--- customer_A_branch

And someone does:

git-branch -f customer_A_branch ZZZ

To bring us to:

.<---A<---X<---Y<---.<--.(3 years of changes)<---ZZZ<--- master  <=== HEAD
      \                                           \
       \--B<---C                                   \-- customer_A_branch


..at this point, will a GC keep "B<--C", or garbage collect the commits and
throw them away?


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24  8:08                     ` David Jeske
@ 2008-06-24 11:22                       ` Jakub Narebski
       [not found]                         ` <willow-jeske-01l5PFjPFEDjCfzf-01l5p7eVFEDjCZRD>
  2008-06-24 12:13                         ` Jakub Narebski
  0 siblings, 2 replies; 64+ messages in thread
From: Jakub Narebski @ 2008-06-24 11:22 UTC (permalink / raw)
  To: David Jeske; +Cc: Avery Pennarun, Nicolas Pitre, git

David Jeske wrote:

> To re-ask the same question I asked in my last post, using your ascii
> pictures...
> 
> 
> Let's assume we're here..
> 
> .<---.<---.<---A<---X<---Y    <--- master
>  \
>   \--B<---C                   <--- customer_A_branch <=== HEAD
> 
> 
> And this person and everyone else moves their head pointers back
> to master without merging:

You could simply say: they stop working on the 'customer_A_branch' branch
(moving the HEAD pointer is simply switching to / checking out / working on
a different branch).

> .<---.<---.<---A<---X<---Y    <--- master              <=== HEAD
>  \
>   \--B<---C                   <--- customer_A_branch
> 
> 
> Now, five years down the road, our tree looks like:
> 
> 
> .<---.<---.<---A<---X<---Y<--.(3 years of changes).--ZZZ  <--- master  <=== HEAD
>  \
>   \--B<---C   <--- customer_A_branch
> 
> And someone does:
> 
> git-branch -f customer_A_branch ZZZ

If they are using '-f', i.e. force, they should know and be sure what
they are doing; it is not much different from 'rm -f *'.

If the reflog for 'customer_A_branch' has expired, it would be hard to go back
to the old 'customer_A_branch', and impossible after the garbage collector
has pruned the history.

What you _should do_, if you want to preserve the old 'customer_A_branch'
pointer, is to *tag* it, e.g. as something like 'Attic/customer_A_branch';
if you use annotated tags you can even state why you want to keep the
old work, why it wasn't merged into a long-lived branch, and
why the work was abandoned.
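
A minimal sketch of the tagging approach, assuming a throwaway repository; the
commit() helper and the tag message are made up for the example. The old tip
is tagged, the branch is then force-moved, all reflogs are expired and git-gc
is run. The tagged commits should survive because the tag still references
them.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit per call, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b customer_A_branch
commit B
commit C
c=$(git rev-parse HEAD)
git checkout -q master
commit ZZZ

# Annotated tag: records who archived the work, when, and why.
git tag -a Attic/customer_A_branch \
    -m "Abandoned customer A work, kept for reference" customer_A_branch

git branch -f customer_A_branch master   # the "accident"
git reflog expire --expire=now --all     # worst case: no reflog left
git gc -q --prune=now

git cat-file -e "$c^{commit}" && echo "B<---C survived the gc"
```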

-- 
Jakub Narebski
Poland


[parent not found: <willow-jeske-01l5PFjPFEDjCfzf-01l5p7eVFEDjCZRD>]

* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                         ` <willow-jeske-01l5PFjPFEDjCfzf-01l5p7eVFEDjCZRD>
@ 2008-06-24 11:29                           ` David Jeske
  2008-06-24 12:21                             ` Jakub Narebski
  2008-06-24 11:29                           ` David Jeske
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24 11:29 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Avery Pennarun, Nicolas Pitre, git

-- Jakub Narebski wrote:
> If they are using '-f', i.e. force, they should know and be sure what
> they are doing; it is not much different from 'rm -f *'.

Sure, no problem. I don't want the ability to "rm -f *". I'm raising my hand
and saying "I don't want the power to do these things, so just turn off all the
git commands that could be destructive and give me an alternate way to do the
workflows I need to do". Just like a normal user on a unix machine doesn't run
around with the power to rm -f /etc all the time, even though they may be able
to su to root.

Let me guess, you're always running euid==0. :)


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 11:29                           ` David Jeske
@ 2008-06-24 12:21                             ` Jakub Narebski
  0 siblings, 0 replies; 64+ messages in thread
From: Jakub Narebski @ 2008-06-24 12:21 UTC (permalink / raw)
  To: David Jeske; +Cc: Avery Pennarun, Nicolas Pitre, git

David Jeske wrote:
> -- Jakub Narebski wrote:
>>
>> If they are using '-f', i.e. force, they should know and be sure what
>> they are doing; it is not much different from 'rm -f *'.

By the way, reflog (even if expired) would protect you in this
situation; I had wrongly concluded that it does not (I mixed up chronological
and reverse chronological order, and did not pay attention to the
timestamps).

> Sure, no problem. I don't want the ability to "rm -f *". [...]

It is a very useful command when deleting a large number of files;
I have "alias rm='rm -i'", and confirming every single file quickly
gets annoying.

> Just like a normal user on a unix machine doesn't run 
> around with the power to rm -f /etc all the time, even though they may be able
> to su to root.

The example was about "rm -f *", i.e. removing the contents of the current
directory; you should be careful when doing it, for example if you are in the
current repository.

Some older versions of UNIX supposedly could hose every hidden file you own
upwards if you did "rm -rf .*", as they matched '..' (parent directory)
against '.*'.

> Let me guess, you're always running euid==0. :)

No.  I almost never log in as root; I use 'sudo' or 'sudo su -', or rely
on applications asking for root credentials when required (for example when
installing a new version of git).

Let me guess: no sharp knives in kitchen? ;-P
-- 
Jakub Narebski
Poland


* Re: why is git destructive by default? (i suggest it not be!)
       [not found]                         ` <willow-jeske-01l5PFjPFEDjCfzf-01l5p7eVFEDjCZRD>
  2008-06-24 11:29                           ` David Jeske
@ 2008-06-24 11:29                           ` David Jeske
  2008-06-24 12:19                             ` Rogan Dawes
  1 sibling, 1 reply; 64+ messages in thread
From: David Jeske @ 2008-06-24 11:29 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Avery Pennarun, Nicolas Pitre, git

-- Jakub Narebski wrote:
> If they are using '-f', i.e. force, they should know and be sure what
> they are doing; it is not much different from 'rm -f *'.

Sure, no problem. I don't want the ability to "rm -f *". I'm raising my hand
and saying "I don't want the power to do these things, so just turn off all the
git commands that could be destructive and give me an alternate way to do the
workflows I need to do". Just like a normal user on a unix machine doesn't run
around with the power to rm -f /etc all the time, even though they may be able
to su to root.

Let me guess, you're always running euid==0. :)


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 11:29                           ` David Jeske
@ 2008-06-24 12:19                             ` Rogan Dawes
  2008-06-24 12:35                               ` Johannes Gilger
  0 siblings, 1 reply; 64+ messages in thread
From: Rogan Dawes @ 2008-06-24 12:19 UTC (permalink / raw)
  To: David Jeske; +Cc: Jakub Narebski, Avery Pennarun, Nicolas Pitre, git

David Jeske wrote:
> -- Jakub Narebski wrote:
>> If they are using '-f', i.e. force, they should know and be sure what
>> they are doing; it is not much different from 'rm -f *'.
> 
> Sure, no problem. I don't want the ability to "rm -f *". I'm raising my hand
> and saying "I don't want the power to do these things, so just turn off all the
> git commands that could be destructive and give me an alternate way to do the
> workflows I need to do". Just like a normal user on a unix machine doesn't run
> around with the power to rm -f /etc all the time, even though they may be able
> to su to root.
> 
> Let me guess, you're always running euid==0. :)

Do you also ask the gnu coreutils folks to remove the -f option from 
their utilities?

There is a basic assumption that folks that are using tools have at 
least made an attempt to understand what it is that they are doing, 
before e.g. waving a chainsaw around.

One thing that I haven't seen addressed in this thread is the fact that 
if you have a dirty working directory, and you "git reset --hard", 
whatever was dirty (not yet in the index, or committed) will be blown 
away, and no amount of reflog archeology will help you get it back.

Any changes that had been staged in the index WILL exist in the object 
directories as dangling objects, and can be retrieved through judicious 
use of "git fsck" and "git show", but will certainly be a painful 
exercise if there was an extensive set of changes.
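
A rough sketch of that rescue, assuming a throwaway repository and a single
staged file (the file name 'notes.txt' and its content are made up). Only what
had been added to the index can be found this way; changes that were never
staged are gone.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git commit -q --allow-empty -m "initial"

# Stage some work, then blow it away.
echo "hours of work" > notes.txt
git add notes.txt
git reset -q --hard

# The staged content still exists as an unreferenced ("dangling") blob.
blob=$(git fsck 2>/dev/null | awk '/^dangling blob/ { print $3 }')
git show "$blob"
```

With many staged files, fsck only reports object IDs, not file names, so
matching blobs back to paths is the painful part mentioned above.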

Rogan


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 12:19                             ` Rogan Dawes
@ 2008-06-24 12:35                               ` Johannes Gilger
  2008-06-24 12:46                                 ` Rogan Dawes
  0 siblings, 1 reply; 64+ messages in thread
From: Johannes Gilger @ 2008-06-24 12:35 UTC (permalink / raw)
  To: Rogan Dawes
  Cc: David Jeske, Jakub Narebski, Avery Pennarun, Nicolas Pitre, git

On 24/06/08 14:19, Rogan Dawes wrote:
> One thing that I haven't seen addressed in this thread is the fact that if 
> you have a dirty working directory, and you "git reset --hard", whatever 
> was dirty (not yet in the index, or committed) will be blown away, and no 
> amount of reflog archeology will help you get it back.

I think the name of the command "reset" itself is a name which should 
prompt everyone to read a manpage before using it. I could understand 
that if "status" did something destructive people would get upset.
Other than that, git reset itself doesn't do anything destructive. Yeah, 
git reset --hard does, but hello, this is *reset* and *hard*, someone 
using this must really want what's about to happen. Nobody complains
about rm --force or anything.

As for putting safety measures everywhere, I think that any further
restricting of commands would be nonsense and would just hinder the
workflow. git is not something with a GUI and a recycle-bin. And it
still is really hard to accidentally lose anything in git.

Regards,
Jojo

-- 
Johannes Gilger <heipei@hackvalue.de>
http://hackvalue.de/heipei/
GPG-Key: 0x42F6DE81
GPG-Fingerprint: BB49 F967 775E BB52 3A81  882C 58EE B178 42F6 DE81


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 12:35                               ` Johannes Gilger
@ 2008-06-24 12:46                                 ` Rogan Dawes
  0 siblings, 0 replies; 64+ messages in thread
From: Rogan Dawes @ 2008-06-24 12:46 UTC (permalink / raw)
  To: Johannes Gilger
  Cc: David Jeske, Jakub Narebski, Avery Pennarun, Nicolas Pitre, git

Johannes Gilger wrote:
> On 24/06/08 14:19, Rogan Dawes wrote:
>> One thing that I haven't seen addressed in this thread is the fact that if 
>> you have a dirty working directory, and you "git reset --hard", whatever 
>> was dirty (not yet in the index, or committed) will be blown away, and no 
>> amount of reflog archeology will help you get it back.
> 
> I think the name of the command "reset" itself is a name which should 
> prompt everyone to read a manpage before using it. I could understand 
> that if "status" did something destructive people would get upset.
> Other than that, git reset itself doesn't do anything destructive. Yeah, 
> git reset --hard does, but hello, this is *reset* and *hard*, someone 
> using this must really want what's about to happen. Nobody complains
> about rm --force or anything.
> 
> As for putting safety measures everywhere, I think that any further
> restricting of commands would be nonsense and would just hinder the
> workflow. git is not something with a GUI and a recycle-bin. And it
> still is really hard to accidentally lose anything in git.
> 
> Regards,
> Jojo
> 

Right. I was simply pointing out to the original poster that for all the 
talk about reflogs, if you use "reset --hard", all bets are off. I was 
not complaining about the existence of that option, or its name . . .

I agree that adding nanny-guards to git would be counter productive.

Rogan


* Re: why is git destructive by default? (i suggest it not be!)
  2008-06-24 11:22                       ` Jakub Narebski
       [not found]                         ` <willow-jeske-01l5PFjPFEDjCfzf-01l5p7eVFEDjCZRD>
@ 2008-06-24 12:13                         ` Jakub Narebski
  1 sibling, 0 replies; 64+ messages in thread
From: Jakub Narebski @ 2008-06-24 12:13 UTC (permalink / raw)
  To: David Jeske; +Cc: Avery Pennarun, Nicolas Pitre, git

Jakub Narebski wrote:
> David Jeske wrote:

> > Now, five years down the road, [...] someone does:
> > 
> >  $ git-branch -f customer_A_branch ZZZ
> 
> If they are using '-f', i.e. force, they should know and be sure what
> they are doing; it is not much different from 'rm -f *'.
> 
> If reflog for 'customer_A_branch' expired it would be hard to go back
> to old 'customer_A_branch', and impossible after garbage collector
> pruned history.

Actually, that is not true.  In the case of "git branch -f <branch>", which
is the case that wouldn't be covered by protecting reflogs when
deleting branches (saving them to some kind of "attic"), git _saves_ the
old branch pointer to the reflog, so "git log -g <branch>" would work
as expected.

The reflog entry looks like the following:

   0c52414d... 80b4c7e5.. A U Thor <author@example.com> 1214306246 +0200 \
	branch: Reset from ZZZ

(where of course there are full SHA-1s of the commits instead of shortened
ones, and everything is on a single line, without line continuation.)
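
A minimal sketch of this, assuming a throwaway repository (the commit() helper
is made up for the example, and the exact wording of the reflog message may
differ between git versions): after the forced move, the previous tip is still
available as 'customer_A_branch@{1}'.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: one commit per call, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b customer_A_branch
commit B
commit C
c=$(git rev-parse HEAD)
git checkout -q master
commit ZZZ

git branch -f customer_A_branch master   # force-move the branch

# The branch's own reflog recorded where it pointed before the move.
git --no-pager reflog show customer_A_branch
git rev-parse 'customer_A_branch@{1}'    # the old tip, i.e. commit C
```

From there, something like "git branch rescued 'customer_A_branch@{1}'" would
give the old line of work a name again, as long as the reflog entry has not
expired.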
 
> What you _should do_, if you want to preserve old 'customer_A_branch'
> pointer is to *tag* it, e.g. something like 'Attic/customer_A_branch';
> if you use annotated tags you can even state why do you want to keep
> old work, and why old work wasn't merged into long-lived branch, and
> why the work was abandoned.

This of course is still valid.

-- 
Jakub Narebski
Poland


end of thread, other threads:[~2008-06-26 15:17 UTC | newest]

Thread overview: 64+ messages
2008-06-24  4:59 why is git destructive by default? (i suggest it not be!) Teemu Likonen
     [not found] ` <e80d075a0806232201o3933d154he2b570986604c30a@mail.gmail.com>
2008-06-24  5:43   ` Teemu Likonen
  -- strict thread matches above, loose matches on Subject: below --
2008-06-25 18:06 Dmitry Potapov
2008-06-24 12:21 Olivier Galibert
     [not found] <willow-jeske-01l5oEsvFEDjCjRW>
     [not found] ` <willow-jeske-01l5PFjPFEDjCfzf-01l5oEswFEDjCZBN>
2008-06-24 10:42   ` David Jeske
2008-06-24 15:29     ` Brandon Casey
     [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5uqS9FEDjCcuF>
2008-06-24 16:41         ` David Jeske
2008-06-24 18:55           ` Brandon Casey
2008-06-25 12:20           ` Matthieu Moy
2008-06-25 17:56           ` Jing Xue
2008-06-24 16:41         ` David Jeske
2008-06-24 10:42   ` David Jeske
2008-06-24  8:35 Björn Steinbrink
     [not found] <jeske@willow=01l5V7waFEDjChmh>
     [not found] ` <willow-jeske-01l5PFjPFEDjCfzf-01l5V7wbFEDjCX7V>
2008-06-24  1:47   ` David Jeske
2008-06-24 17:11     ` Boaz Harrosh
2008-06-24 17:19       ` Boaz Harrosh
2008-06-24 19:08         ` Jakub Narebski
     [not found]           ` <willow-jeske-01l5PFjPFEDjCfzf-01l5zrLdFEDjCV3U>
2008-06-24 20:04             ` David Jeske
2008-06-24 20:04             ` David Jeske
2008-06-24 21:42               ` Brandon Casey
     [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l63P33FEDjCVQ0>
2008-06-24 22:13                   ` David Jeske
2008-06-24 22:13                   ` David Jeske
2008-06-24 22:54                 ` Theodore Tso
2008-06-24 23:07                   ` Junio C Hamano
2008-06-25  2:26                     ` Theodore Tso
2008-06-25  8:58                       ` Jakub Narebski
2008-06-25  9:14                         ` Junio C Hamano
2008-06-26 15:13                       ` Brandon Casey
2008-06-24 22:21               ` Steven Walter
2008-06-25  8:57           ` Boaz Harrosh
2008-06-24 18:18       ` Brandon Casey
2008-06-24  1:47   ` David Jeske
     [not found] ` <willow-jeske-01l5PFjPFEDjCfzf-01l5V7wbFEDjCX7V@videotron.ca>
     [not found]   ` <willow-jeske-01l5cKsCFEDjC=91MX@videotron.ca>
2008-06-24  2:17     ` Nicolas Pitre
     [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9>
2008-06-24  3:18         ` David Jeske
2008-06-24  8:14           ` Lea Wiemann
2008-06-24  3:18         ` David Jeske
     [not found]       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5ciVtFEDjCaD9@videotron.ca>
     [not found]         ` <willow-jeske-01l5e9cgFEDjCh3F@videotron.ca>
2008-06-24  4:03           ` Nicolas Pitre
     [not found]             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5fAcTFEDjCWA4>
2008-06-24  4:40               ` David Jeske
2008-06-24  4:40               ` David Jeske
2008-06-24  5:24                 ` Jan Krüger
     [not found]             ` <1978205964779154253@unknownmsgid>
2008-06-24  5:20               ` Avery Pennarun
     [not found]                 ` <willow-jeske-01l5PFjPFEDjCfzf-01l5gtQ7FEDjCWCC>
2008-06-24  6:35                   ` David Jeske
2008-06-24  7:24                     ` Jeff King
     [not found]                       ` <willow-jeske-01l5PFjPFEDjCfzf-01l5jmMuFEDjChvB>
2008-06-24  7:31                         ` David Jeske
2008-06-24  8:16                           ` Jeff King
     [not found]                             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S>
2008-06-24  8:30                               ` David Jeske
2008-06-24  9:39                                 ` Jakub Narebski
2008-06-24  8:30                               ` David Jeske
     [not found]                             ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kv6TFEDjCj8S@brm-avmta-1.central.sun.com>
     [not found]                               ` <willow-jeske-01l5lTEoFEDjCVta@brm-avmta-1.central.sun.com>
2008-06-24 10:01                                 ` Fedor Sergeev
2008-06-24 10:24                                   ` David Jeske
2008-06-24 13:13                                     ` Theodore Tso
2008-06-24  7:31                         ` David Jeske
2008-06-24  6:35                   ` David Jeske
2008-06-24  7:54                 ` Jakub Narebski
     [not found]                   ` <willow-jeske-01l5PFjPFEDjCfzf-01l5kQf4FEDjCXUa>
2008-06-24  8:08                     ` David Jeske
2008-06-24  8:08                     ` David Jeske
2008-06-24 11:22                       ` Jakub Narebski
     [not found]                         ` <willow-jeske-01l5PFjPFEDjCfzf-01l5p7eVFEDjCZRD>
2008-06-24 11:29                           ` David Jeske
2008-06-24 12:21                             ` Jakub Narebski
2008-06-24 11:29                           ` David Jeske
2008-06-24 12:19                             ` Rogan Dawes
2008-06-24 12:35                               ` Johannes Gilger
2008-06-24 12:46                                 ` Rogan Dawes
2008-06-24 12:13                         ` Jakub Narebski
