[RFD] Rewriting safety - warn before/when rewriting published history

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFD] Rewriting safety - warn before/when rewriting published history
@ 2012-02-04 19:45 Jakub Narebski
  2012-02-05 14:33 ` Ben Walton
                   ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Jakub Narebski @ 2012-02-04 19:45 UTC (permalink / raw)
  To: git

Git includes protection against rewriting published history on the 
receive side with fast-forward check by default (which can be 
overridden) and various receive.deny* configuration variables,
including receive.denyNonFastForwards.

Nevertheless git users requested (among others in Git User's Survey) 
more help on creation side, namely preventing rewriting parts of 
history which was already made public (or at least warning that one is 
about to rewrite published history).  The "warn before/when rewriting 
published history" answer in "17. Which of the following features would 
you like to see implemented in git?" multiple-choice question in latest
Git User's Survey 2011[1] got 24% (1525) responses.

[1]: https://www.survs.com/results/Q5CA9SKQ/P7DE07F0PL

So people would like for git to warn them about rewriting history before 
they attempt a push and it turns out to not fast-forward.

What prompted this email is the fact that Mercurial includes support for 
tracking which revisions (changesets) are safe to modify in its 2.1 
latest version:

  http://lwn.net/Articles/478795/
  http://mercurial.selenic.com/wiki/WhatsNew

It does that by tracking so called "phase" of a changeset (revision).

  http://mercurial.selenic.com/wiki/Phases
  http://mercurial.selenic.com/wiki/PhasesDevel

  http://www.logilab.org/blogentry/88203
  http://www.logilab.org/blogentry/88219
  http://www.logilab.org/blogentry/88259

While we don't have to play catch-up with Mercurial features, I think 
something similar to what Mercurial has to warn about rewriting 
published history (amend, rebase, perhaps even filter-branch) would
be nice to have.  Perhaps even follow UI used by Mercurial, and/or
translating its implementation into git terms.

In Mercurial 2.1 there are three available phases: 'public' for
published commits, 'draft' for local un-published commits and
'secret' for local un-published commits which are not meant to
be published.

The phase of a changeset is always equal to or higher than the phase
of it's descendants, according to the following order:

      public < draft < secret

Commits start life as 'draft', and move to 'public' on push.

Mercurial documentation talks about phase of a commit, which might
be a good UI, ut also about commits in 'public' phase being "immutable".
As commits in Git are immutable, and rewriting history is in fact
re-doing commits, this description should probably be changed.

While default "push matching" behavior makes it possible to have 
"secret" commits, being able to explicitly mark commits as not for
publishing might be a good idea also for Git.

What do you think about this?
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-04 19:45 [RFD] Rewriting safety - warn before/when rewriting published history Jakub Narebski
@ 2012-02-05 14:33 ` Ben Walton
  2012-02-05 15:05   ` Jakub Narebski
       [not found] ` <CAFA910035B74E56A52A96097E76AC39@PhilipOakley>
  2012-02-06  0:57 ` Steven Michalske
  2 siblings, 1 reply; 34+ messages in thread
From: Ben Walton @ 2012-02-05 14:33 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Excerpts from Jakub Narebski's message of Sat Feb 04 14:45:53 -0500 2012:

Hi Jakub,

These items are as much about UI as anything else, I think.  UI that
better helps users to know the state of their commits and branches can
only be a good thing.  People that have used git for a while and are
comfortable with it may not see the need/point of these, but I think
they could both really help new users.

> In Mercurial 2.1 there are three available phases: 'public' for
> published commits, 'draft' for local un-published commits and
> 'secret' for local un-published commits which are not meant to be
> published.

How do you envision such a feature in git?

A 'draft' commit (or chain of commits) could be determined from the
push matching definitions and then marked with simple decorations in
log output...This would extend the ability of status to note that your
are X commits ahead of foo.  This would see any commit on a branch
that would be pushed automatically decorated with a 'draft' status.

> While default "push matching" behavior makes it possible to have
> "secret" commits, being able to explicitly mark commits as not for
> publishing might be a good idea also for Git.

Do you see using configuration or convention to achieve this?

For example, any branch named private/foo could, by convention, be
un-pushable without a force option?  Alternately, a config item
similar to the push matching stuff to allow the users to designate
un-pushable branches could work too.

Please don't take the above implementation possibilities as anything
more than a starting point for discussion as they may be deeply
flawed.  I'm just tossing a few things out there as I think this is a
good discussion to have.

Thanks
-Ben
--
Ben Walton
Systems Programmer - CHASS
University of Toronto
C:416.407.5610 | W:416.978.4302

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-05 14:33 ` Ben Walton
@ 2012-02-05 15:05   ` Jakub Narebski
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Narebski @ 2012-02-05 15:05 UTC (permalink / raw)
  To: Ben Walton; +Cc: git

On Sun, 5 Feb 2012, Ben Walton wrote:
> Excerpts from Jakub Narebski's message of Sat Feb 04 14:45:53 -0500 2012:
> 
> Hi Jakub,
> 
> These items are as much about UI as anything else, I think.  UI that
> better helps users to know the state of their commits and branches can
> only be a good thing.  People that have used git for a while and are
> comfortable with it may not see the need/point of these, but I think
> they could both really help new users.

As I said, 1500+ git users would like to have such feature, according
to latest Git User's Survey.

> > In Mercurial 2.1 there are three available phases: 'public' for
> > published commits, 'draft' for local un-published commits and
> > 'secret' for local un-published commits which are not meant to be
> > published.
> 
> How do you envision such a feature in git?
> 
> A 'draft' commit (or chain of commits) could be determined from the
> push matching definitions and then marked with simple decorations in
> log output...This would extend the ability of status to note that your
> are X commits ahead of foo.  This would see any commit on a branch
> that would be pushed automatically decorated with a 'draft' status.

I think that in its basic form (treating all remotes equally) commits
in 'public' phase would be those reachable from remote-tracking branches.
Otherwise commits would be in 'draft' phase, unless explicitly marked
as 'secret' (it we implement 'secret' phase, that is).

The safety new I think of would (similarly to Mercurial phases) prevent
or warn about amending published commit, and rebasing commits which were
already published (in 'public' phase).  That would require modifications
to git-commit and git-amend, I think...

Maybe even Git could refuse or warn on the local side about non
fast-forward update of public branch, to help users of third-party tools.

> > While default "push matching" behavior makes it possible to have
> > "secret" commits, being able to explicitly mark commits as not for
> > publishing might be a good idea also for Git.
> 
> Do you see using configuration or convention to achieve this?
> 
> For example, any branch named private/foo could, by convention, be
> un-pushable without a force option?  Alternately, a config item
> similar to the push matching stuff to allow the users to designate
> un-pushable branches could work too.

I'm not sure, but the config item might be a good solution.  Git would
skip publishing 'secret' commits (commits from 'secret' branch) if it
would otherwise publish it due to glob refspec, and refuse (or warn)
publishing 'secret' branches explicitly.

Currently if you use default "push matching", then those branches that
you didn't push explicitly wouldn't be pushed.  But that does not prevent
pushing them by accident, and does not give UI to check if branch is
private or not (e.g. to use in git-aware shell prompt).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

[parent not found: <CAFA910035B74E56A52A96097E76AC39@PhilipOakley>]

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
       [not found] ` <CAFA910035B74E56A52A96097E76AC39@PhilipOakley>
@ 2012-02-05 16:15   ` Jakub Narebski
  2012-02-05 17:29     ` Johan Herland
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-02-05 16:15 UTC (permalink / raw)
  To: Philip Oakley; +Cc: git

Please don't remove git mailing list from Cc... Oh, I see that you
forgot to send to list, but resend your email there.

On Sun, 5 Feb 2012, Philip Oakley wrote:
> From: "Jakub Narebski" <jnareb@gmail.com>
> Sent: Saturday, February 04, 2012 7:45 PM

> > Git includes protection against rewriting published history on the
> > receive side with fast-forward check by default (which can be
> > overridden) and various receive.deny* configuration variables,
> > including receive.denyNonFastForwards.
> >
> > Nevertheless git users requested (among others in Git User's Survey)
> > more help on creation side, namely preventing rewriting parts of
> > history which was already made public (or at least warning that one is
> > about to rewrite published history).  The "warn before/when rewriting
> > published history" answer in "17. Which of the following features would
> > you like to see implemented in git?" multiple-choice question in latest
> > Git User's Survey 2011[1] got 24% (1525) responses.
> >
> > [1]: https://www.survs.com/results/Q5CA9SKQ/P7DE07F0PL
> >
> > So people would like for git to warn them about rewriting history before
> > they attempt a push and it turns out to not fast-forward.
> 
> Another area that is implicitly related is that of (lack of) publication of 
> sub-module updates. A mechanisms that, in the super project, knows the 
> status of the (local) submodules, such as where they would be sourced from, 
> i.e. what was last pushed & where, could help in such instances.

"Better support for submodules" had almost the same number of requests
in the latest Git User's Survey 2011 (25% which means 1582 responses).
 
Remembering when to do recursive push and where would be a very nice thing.

[...]
> Recording where they were pushed to would be useful for synchronising 
> sub-modules and their super projects. That is, giving remote users a clue as 
> to where they might find mising sub-modules.

Is it a matter of correctly writing configuration with current git?
I don't use submodules myself, so I cannot say.

> > Mercurial documentation talks about phase of a commit, which might
> > be a good UI, ut also about commits in 'public' phase being "immutable".
> > As commits in Git are immutable, and rewriting history is in fact
> > re-doing commits, this description should probably be changed.
> >
> > While default "push matching" behavior makes it possible to have
> > "secret" commits, being able to explicitly mark commits as not for
> > publishing might be a good idea also for Git.
> >
> 
> Being able to mark temporary, out of sequence or other hacks as Secret could 
> be useful, as would recording where Public commits had been sent.

Marking as 'secret' must I think be explicit, but I think 'public' phase
should be inferred from remote-tracking branches.  The idea of phases is
to allow UI to ask about status of commits: can we amend / rebase it or
not, can we push it or not.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-05 16:15   ` Jakub Narebski
@ 2012-02-05 17:29     ` Johan Herland
  2012-02-05 20:46       ` Jakub Narebski
  0 siblings, 1 reply; 34+ messages in thread
From: Johan Herland @ 2012-02-05 17:29 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Philip Oakley, git

2012/2/5 Jakub Narebski <jnareb@gmail.com>:
>> Being able to mark temporary, out of sequence or other hacks as Secret could
>> be useful, as would recording where Public commits had been sent.
>
> Marking as 'secret' must I think be explicit, but I think 'public' phase
> should be inferred from remote-tracking branches.  The idea of phases is
> to allow UI to ask about status of commits: can we amend / rebase it or
> not, can we push it or not.

I agree that the 'public' state should (by default) be automatically
inferred from remote-tracking branches. As it stands, we can do this
with current git, by writing a pre-rebase hook that checks if any of
the commits to-be-rebased are reachable from any remote-tracking
branch.

Unfortunately, the pre-rebase hook only affects 'git rebase', and in
order to do the same check on 'git commit --amend' you'd have to write
a similar pre-commit hook (don't know how easy it is to find the
amended commit from within the hook). Maybe we should add a
pre-rewrite hook that trigger in the same situations as the
post-rewrite hook.

This should take care of the simplest 'public' use case in a
push-based workflow. If you publish commits by other means
(send-email, bundles, pulling directly from your repo), you need some
other way to mark the 'public' commits. One solution would be using
'git notes' to annotate the 'public' commits on a 'refs/notes/public'
notes ref. Your pre-rebase/pre-rewrite hook should then check if any
of the commits to-be-rewritten are reachable from any commit annotated
as 'public'.

Also, if you want to record where 'public' commits have been sent
(other than what can be inferred from the remote-tracking branches),
you could write this into the refs/notes/public annotation.

As for 'secret' commits, you could annotate these on a
refs/notes/secret notes ref, and then teach 'git push' (or whatever
other method for publishing commits you use) to refuse to publish
commits annotated on this notes ref. Possibly we would want to add a
"pre-push" or "pre-publish" hook.

Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-05 17:29     ` Johan Herland
@ 2012-02-05 20:46       ` Jakub Narebski
  2012-02-05 22:49         ` Johan Herland
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-02-05 20:46 UTC (permalink / raw)
  To: Johan Herland; +Cc: Philip Oakley, git

On Sun, 5 Feb 2012, Johan Herland wrote:
> 2012/2/5 Jakub Narebski <jnareb@gmail.com>:

> > > Being able to mark temporary, out of sequence or other hacks as Secret could
> > > be useful, as would recording where Public commits had been sent.
> >
> > Marking as 'secret' must I think be explicit, but I think 'public' phase
> > should be inferred from remote-tracking branches.  The idea of phases is
> > to allow UI to ask about status of commits: can we amend / rebase it or
> > not, can we push it or not.
> 
> I agree that the 'public' state should (by default) be automatically
> inferred from remote-tracking branches. As it stands, we can do this
> with current git, by writing a pre-rebase hook that checks if any of
> the commits to-be-rebased are reachable from any remote-tracking
> branch.

It is nice that we can achieve a large part of this feature with existing
infrastructure.  It would be nice if we ship such pre-rebase hook with
git, so people can just enable it if they want to use this functionality,
like the default pre-commit hook that checks for whitespace errors.

> Unfortunately, the pre-rebase hook only affects 'git rebase', and in
> order to do the same check on 'git commit --amend' you'd have to write
> a similar pre-commit hook (don't know how easy it is to find the
> amended commit from within the hook). Maybe we should add a
> pre-rewrite hook that trigger in the same situations as the
> post-rewrite hook.

pre-rewrite hook would be a really nice to have, especially that it would
(I hope) cover third party tools like various GUIs for git; and also
git-filter-branch.

Note however that the safety net, i.e. refusing or warning against attempted
rewrite of published history is only part of issue.  Another important part
is querying and showing "phase" of a commit.  What I'd like to see is
ability to show among others in "git log" and "git show" output if commit
was already published or not (and if it is marked 'secret').

> This should take care of the simplest 'public' use case in a
> push-based workflow. If you publish commits by other means
> (send-email, bundles, pulling directly from your repo), you need some
> other way to mark the 'public' commits. One solution would be using
> 'git notes' to annotate the 'public' commits on a 'refs/notes/public'
> notes ref. Your pre-rebase/pre-rewrite hook should then check if any
> of the commits to-be-rewritten are reachable from any commit annotated
> as 'public'.

Another solution would be to create "fake" remote-tracking branches
by git-bundle and git-send-email.

> Also, if you want to record where 'public' commits have been sent
> (other than what can be inferred from the remote-tracking branches),
> you could write this into the refs/notes/public annotation.

I wonder if this too can be done by hook...

> As for 'secret' commits, you could annotate these on a
> refs/notes/secret notes ref, and then teach 'git push' (or whatever
> other method for publishing commits you use) to refuse to publish
> commits annotated on this notes ref. Possibly we would want to add a
> "pre-push" or "pre-publish" hook.

Well, addition of pre-push / pre-publish was resisted on the grounds
that all it does is something that can be as easy done by hand before
push.  Perhaps this new use case would help bring it forward, don't
you think?

Thanks for all the comments.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-05 20:46       ` Jakub Narebski
@ 2012-02-05 22:49         ` Johan Herland
  2012-02-06 14:44           ` Jakub Narebski
  0 siblings, 1 reply; 34+ messages in thread
From: Johan Herland @ 2012-02-05 22:49 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Philip Oakley, git

On Sun, Feb 5, 2012 at 21:46, Jakub Narebski <jnareb@gmail.com> wrote:
> On Sun, 5 Feb 2012, Johan Herland wrote:
>> 2012/2/5 Jakub Narebski <jnareb@gmail.com>:
>> > > Being able to mark temporary, out of sequence or other hacks as Secret could
>> > > be useful, as would recording where Public commits had been sent.
>> >
>> > Marking as 'secret' must I think be explicit, but I think 'public' phase
>> > should be inferred from remote-tracking branches.  The idea of phases is
>> > to allow UI to ask about status of commits: can we amend / rebase it or
>> > not, can we push it or not.
>>
>> I agree that the 'public' state should (by default) be automatically
>> inferred from remote-tracking branches. As it stands, we can do this
>> with current git, by writing a pre-rebase hook that checks if any of
>> the commits to-be-rebased are reachable from any remote-tracking
>> branch.
>
> It is nice that we can achieve a large part of this feature with existing
> infrastructure.  It would be nice if we ship such pre-rebase hook with
> git, so people can just enable it if they want to use this functionality,
> like the default pre-commit hook that checks for whitespace errors.

Yeah. As it is, the pre-rebase hook shipped with v1.7.9 (when
activated) does something similar (i.e. prevent rewriting 'public'
commits). However, it's highly workflow-specific, since it determines
whether the branch being rebased has been merged into "next" or
"master". IMHO, a hook that tested for reachability from
remote-tracking refs would be more generally useful. Obviously, the
two can be combined, and even further combinations may be desirable
(e.g. also checking for reachability from commits annotated in
refs/notes/public).

>> Unfortunately, the pre-rebase hook only affects 'git rebase', and in
>> order to do the same check on 'git commit --amend' you'd have to write
>> a similar pre-commit hook (don't know how easy it is to find the
>> amended commit from within the hook). Maybe we should add a
>> pre-rewrite hook that trigger in the same situations as the
>> post-rewrite hook.
>
> pre-rewrite hook would be a really nice to have, especially that it would
> (I hope) cover third party tools like various GUIs for git; and also
> git-filter-branch.
>
> Note however that the safety net, i.e. refusing or warning against attempted
> rewrite of published history is only part of issue.  Another important part
> is querying and showing "phase" of a commit.  What I'd like to see is
> ability to show among others in "git log" and "git show" output if commit
> was already published or not (and if it is marked 'secret').

Today, you can use --decorate to display remote-tracking refs in the
log/show output. However, only the tip commits are decorated, so if
the commits shown are not at the tip, you're out of luck. I believe
teaching log/show to decorate _all_ commits that are reachable from
some given ref(s) should be fairly straightforward.

If you use 'git notes' to annotate 'public' and 'secret' states, then
you can also use the --show-notes=<ref> option to let show/log display
the annotations on 'public'/'secret' commits.

>> This should take care of the simplest 'public' use case in a
>> push-based workflow. If you publish commits by other means
>> (send-email, bundles, pulling directly from your repo), you need some
>> other way to mark the 'public' commits. One solution would be using
>> 'git notes' to annotate the 'public' commits on a 'refs/notes/public'
>> notes ref. Your pre-rebase/pre-rewrite hook should then check if any
>> of the commits to-be-rewritten are reachable from any commit annotated
>> as 'public'.
>
> Another solution would be to create "fake" remote-tracking branches
> by git-bundle and git-send-email.

Good point. Creating such "fake" remote-tracking branches might be a
good idea in those workflows anyway, simply to keep track of what has
been shared, and where.

>> Also, if you want to record where 'public' commits have been sent
>> (other than what can be inferred from the remote-tracking branches),
>> you could write this into the refs/notes/public annotation.
>
> I wonder if this too can be done by hook...

You're looking for someting like a post-push hook that runs on the
_client_ after a successful push. AFAIK, that doesn't exist yet. (Not
to be confused with the receive/update hooks that run on the
_server_.)

>> As for 'secret' commits, you could annotate these on a
>> refs/notes/secret notes ref, and then teach 'git push' (or whatever
>> other method for publishing commits you use) to refuse to publish
>> commits annotated on this notes ref. Possibly we would want to add a
>> "pre-push" or "pre-publish" hook.
>
> Well, addition of pre-push / pre-publish was resisted on the grounds
> that all it does is something that can be as easy done by hand before
> push.  Perhaps this new use case would help bring it forward, don't
> you think?

Maybe. I didn't follow the original discussion. From my POV, you could
argue that instead of another hook, you could always write a script
that does the 'secret' check before invoking 'git push', and then
you'd use that script instead of 'git push'. But you could argue the
same point for pretty much all of the other existing hooks (e.g.
instead of a pre-commit hook you could have your own commit wrapper
script). So I don't think that's a sufficient argument to refuse the
existence of a pre-push/publish hook.

Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-05 22:49         ` Johan Herland
@ 2012-02-06 14:44           ` Jakub Narebski
  2012-02-06 15:59             ` Johan Herland
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-02-06 14:44 UTC (permalink / raw)
  To: Johan Herland; +Cc: Philip Oakley, git

On Sun, 5 Feb 2012, Johan Herland wrote:
> On Sun, Feb 5, 2012 at 21:46, Jakub Narebski <jnareb@gmail.com> wrote:
>> On Sun, 5 Feb 2012, Johan Herland wrote:
>>> 2012/2/5 Jakub Narebski <jnareb@gmail.com>:

[...]
>>> I agree that the 'public' state should (by default) be automatically
>>> inferred from remote-tracking branches. As it stands, we can do this
>>> with current git, by writing a pre-rebase hook that checks if any of
>>> the commits to-be-rebased are reachable from any remote-tracking
>>> branch.
>>
>> It is nice that we can achieve a large part of this feature with existing
>> infrastructure.  It would be nice if we ship such pre-rebase hook with
>> git, so people can just enable it if they want to use this functionality,
>> like the default pre-commit hook that checks for whitespace errors.
> 
> Yeah. As it is, the pre-rebase hook shipped with v1.7.9 (when
> activated) does something similar (i.e. prevent rewriting 'public'
> commits). However, it's highly workflow-specific, since it determines
> whether the branch being rebased has been merged into "next" or
> "master". IMHO, a hook that tested for reachability from
> remote-tracking refs would be more generally useful. Obviously, the
> two can be combined, and even further combinations may be desirable
> (e.g. also checking for reachability from commits annotated in
> refs/notes/public).

Relying on (default) hooks to implement this feature has the disadvantage
that it wouldn't be turned on by default... while this feature would be
most helpful for users new to git (scared by refuse to push).
 
I am not sure either if everything (wrt. safety net) can be implemented
via hooks.  One thing that I forgot about is preventing rewinding of
branch past the published commit using e.g. "git reset --hard <commit>".
Unless `pre-rewrite` hook could be used for that safety too...

[...]
>> Note however that the safety net, i.e. refusing or warning against attempted
>> rewrite of published history is only part of issue.  Another important part
>> is querying and showing "phase" of a commit.  What I'd like to see is
>> ability to show among others in "git log" and "git show" output if commit
>> was already published or not (and if it is marked 'secret').
> 
> Today, you can use --decorate to display remote-tracking refs in the
> log/show output. However, only the tip commits are decorated, so if
> the commits shown are not at the tip, you're out of luck. I believe
> teaching log/show to decorate _all_ commits that are reachable from
> some given ref(s) should be fairly straightforward.

That would be nice.
 
> If you use 'git notes' to annotate 'public' and 'secret' states, then
> you can also use the --show-notes=<ref> option to let show/log display
> the annotations on 'public'/'secret' commits.

First, in my opinion annotating _all_ commits with their phase is I think
out of question, especially annotating 'public' commits.  I don't think
git-notes mechanism would scale well to annotating every commit; but
perhaps this was tested to work, and I am mistaken. 
 
Second, I have doubts if "phase" is really state of an individual commit,
and not the feature of revision walking.

Take for example the situation where given commit is reference by 
remote-tracking branch 'public/foo', and also by two local branches:
'foo' with upstream 'public/foo', and local branch 'bar' with no upstream.

Now it is quite obvious that this feature should prevent rewriting 'foo'
branch, for which commits are published upstream.  But what about branch
'bar'?  Should we prevent rewriting (e.g. rebase) here too?  What about
rewinding 'bar' to point somewhere else.  What if 'bar' is really detached
HEAD? 

These questions need to be answered...

[...]
>>> Also, if you want to record where 'public' commits have been sent
>>> (other than what can be inferred from the remote-tracking branches),
>>> you could write this into the refs/notes/public annotation.
>>
>> I wonder if this too can be done by hook...
> 
> You're looking for someting like a post-push hook that runs on the
> _client_ after a successful push. AFAIK, that doesn't exist yet. (Not
> to be confused with the receive/update hooks that run on the
> _server_.)

And such hook could react to what was successfully pushed.  Without
such hook we would have to write wrapper around git-push and parse
its output...
 
Nb. such hook could create "fake" remote-tracking branches if given
push-only remote doesn't have them set up.

>>> As for 'secret' commits, you could annotate these on a
>>> refs/notes/secret notes ref, and then teach 'git push' (or whatever
>>> other method for publishing commits you use) to refuse to publish
>>> commits annotated on this notes ref. Possibly we would want to add a
>>> "pre-push" or "pre-publish" hook.
>>
>> Well, addition of pre-push / pre-publish was resisted on the grounds
>> that all it does is something that can be as easy done by hand before
>> push.  Perhaps this new use case would help bring it forward, don't
>> you think?
> 
> Maybe. I didn't follow the original discussion. From my POV, you could
> argue that instead of another hook, you could always write a script
> that does the 'secret' check before invoking 'git push', and then
> you'd use that script instead of 'git push'. But you could argue the
> same point for pretty much all of the other existing hooks (e.g.
> instead of a pre-commit hook you could have your own commit wrapper
> script). So I don't think that's a sufficient argument to refuse the
> existence of a pre-push/publish hook.

Right, This would be for this feature very much like pre-commit hook.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06 14:44           ` Jakub Narebski
@ 2012-02-06 15:59             ` Johan Herland
  2012-02-06 17:14               ` Jakub Narebski
  0 siblings, 1 reply; 34+ messages in thread
From: Johan Herland @ 2012-02-06 15:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Philip Oakley, git

On Mon, Feb 6, 2012 at 15:44, Jakub Narebski <jnareb@gmail.com> wrote:
> On Sun, 5 Feb 2012, Johan Herland wrote:
> Relying on (default) hooks to implement this feature has the disadvantage
> that it wouldn't be turned on by default... while this feature would be
> most helpful for users new to git (scared by refuse to push).

True. I too believe that this will be most helpful if it is enabled by
default. That said, the easiest way to get there might be through
first demonstrating that it works in practice when implemented as
hooks.

> I am not sure either if everything (wrt. safety net) can be implemented
> via hooks.  One thing that I forgot about is preventing rewinding of
> branch past the published commit using e.g. "git reset --hard <commit>".
> Unless `pre-rewrite` hook could be used for that safety too...

Hmm. I don't think we'll be able to "plug" all the holes that might
leave the user in a rewritten state (e.g. what if the user (possibly
with the help of some tool) does an "echo $SHA1 >
.git/refs/head/master"?). And trying to plug too many holes might end
up annoying more experienced users who "know what they're doing".
Instead we might want to add a client-side check at push time. I
realize that this check is already done by the remote end, but the
client-side might be able to give a more helpful response along the
lines of:

  You are trying to push branch X to remote Y, but remote Y already
has a branch X that is N commits in front of you. You may want to
rebase your work on top of the remote branch (see 'git pull
--rebase'), If you instead force this push (with --force), you will
remove those N commits, and replace them with the M last commits from
your branch X.

  (followed by a list of the remote N and local M commits, respectively)

[...]

>> If you use 'git notes' to annotate 'public' and 'secret' states, then
>> you can also use the --show-notes=<ref> option to let show/log display
>> the annotations on 'public'/'secret' commits.
>
> First, in my opinion annotating _all_ commits with their phase is I think
> out of question, especially annotating 'public' commits.  I don't think
> git-notes mechanism would scale well to annotating every commit; but
> perhaps this was tested to work, and I am mistaken.

First, we don't need to annotate _all_ commits. For the 'public'
state, we only annotate the last/tip commit that was pushed/published.
From there, we can defer that all ancestor commits are also 'public'.
For the 'secret' state, we do indeed annotate _all_ secret commits,
but I believe this will be a somewhat limited number of commits. If
your workflow forces you to annotate millions of commits as 'secret',
I claim there is something wrong with your workflow.

Second, git-notes were indeed designed scale well to handle a large
number of notes, up to the same order of magnitude as the number of
commits in your repo. (When git-notes was originally written, I
successfully tested it on versions of a linux-kernel repo where every
single commit was annotated). In this case, the number of 'public'
annotations in your repo would be equal to the number of pushes you
do, and the number of 'secret' annotations would be equal to the
number of 'secret' commits in your repo. I'd expect both of these
numbers to be orders of magnitude smaller than the total number of
commits in your repo (given a fairly typical workflow in a fairly
typical repo).

> Second, I have doubts if "phase" is really state of an individual commit,
> and not the feature of revision walking.

I believe the 'public' state is a "feature of revision walking" (i.e.
one annotated 'public' commit implies that all its ancestors are also
'public'). However, the 'secret' state should be bound to the
individual commit, IMHO.

> Take for example the situation where given commit is reference by
> remote-tracking branch 'public/foo', and also by two local branches:
> 'foo' with upstream 'public/foo', and local branch 'bar' with no upstream.
>
> Now it is quite obvious that this feature should prevent rewriting 'foo'
> branch, for which commits are published upstream.  But what about branch
> 'bar'?  Should we prevent rewriting (e.g. rebase) here too?  What about
> rewinding 'bar' to point somewhere else.  What if 'bar' is really detached
> HEAD?
>
> These questions need to be answered...

Good point. There are two questions we may need to answer: "Has commit
X ever been published?", and "Has commit X ever been published in the
context of branch Y?". In the latter case, we do indeed need to take
the upstream branch into account.

Basically, there are three different "levels" for this rewrite/publish
protection to run at:

1. Do not meddle at all. This is the current behavior, and assumes
that if the user rewrites and pushes something, the user knows what
he/she is doing, and Git should not meddle (obviously unless the
server refuses the push).

2. Warn/refuse rewriting commits in your upstream. This would only
check branch X against its registered upstream. Only if there is a
registered upstream, and you're about to rewrite commits that are
reachable from the upstream remote-tracking branch, should Git
intervene and warn/refuse the rewrite. This level would IMHO provide
most of the benefit, and little or no trouble (i.e. false positives).

3. Warn/refuse rewriting _any_ 'public' commit. Refuse to rewrite any
commit that is reachable from any remote-tracking branch. Some would
say that this is a Good Thing(tm), since it prevents a commit from
being _copied_ (i.e. rebased or cherry-picked) between branches (you'd
be in this camp if you run a tightly-controlled workflow, where you
e.g. mandate upmerging patches from the oldest applicable branch
instead of cherry-picking patches from a newer branch). However, other
people would say that this is too limiting, and imposes unnecessary
rules on the workflow of the project (where e.g. copying (by way of
git-rebase) a topic branch from one place to another would cause an
annoying false positive).

[...]

Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06 15:59             ` Johan Herland
@ 2012-02-06 17:14               ` Jakub Narebski
  2012-02-06 20:16                 ` Johan Herland
  0 siblings, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-02-06 17:14 UTC (permalink / raw)
  To: Johan Herland; +Cc: Philip Oakley, git

On Mon, 6 Feb 2012, Johan Herland wrote:
> On Mon, Feb 6, 2012 at 15:44, Jakub Narebski <jnareb@gmail.com> wrote:
> > On Sun, 5 Feb 2012, Johan Herland wrote:

> > Relying on (default) hooks to implement this feature has the disadvantage
> > that it wouldn't be turned on by default... while this feature would be
> > most helpful for users new to git (scared by refuse to push).
> 
> True. I too believe that this will be most helpful if it is enabled by
> default. That said, the easiest way to get there might be through
> first demonstrating that it works in practice when implemented as
> hooks.

Yes, starting with prototype implementation using existing infrastructure
(hooks) would be a very good idea.  (That's how first versions of what
became submodules were implemented.)

OTOH we should be aware of limitations of said prototype due to the fact
that it is a prototype...
 
> > I am not sure either if everything (wrt. safety net) can be implemented
> > via hooks.  One thing that I forgot about is preventing rewinding of
> > branch past the published commit using e.g. "git reset --hard <commit>".
> > Unless `pre-rewrite` hook could be used for that safety too...
> 
> Hmm. I don't think we'll be able to "plug" all the holes that might
> leave the user in a rewritten state (e.g. what if the user (possibly
> with the help of some tool) does an "echo $SHA1 >
> .git/refs/head/master"?).

First, I was thinking about having safety net against rewriting published
commits being present only in porcelain.  Plumbing would be not affected
(perhaps there would be need to extend or add new plumbing to query "phase"
state, though).

> And trying to plug too many holes might end 
> up annoying more experienced users who "know what they're doing".

Second, forcing via command line parameter should always be an option.

> Instead we might want to add a client-side check at push time. I
> realize that this check is already done by the remote end, but the
> client-side might be able to give a more helpful response along the
> lines of:

[...]

Explanation is good, but the whole idea of rewriting safety is that you
are informed (warned or denied) _before_ attempting rewrite and doing much
work.

> > > If you use 'git notes' to annotate 'public' and 'secret' states, then
> > > you can also use the --show-notes=<ref> option to let show/log display
> > > the annotations on 'public'/'secret' commits.
> >
> > First, in my opinion annotating _all_ commits with their phase is I think
> > out of question, especially annotating 'public' commits.  I don't think
> > git-notes mechanism would scale well to annotating every commit; but
> > perhaps this was tested to work, and I am mistaken.
> 
> First, we don't need to annotate _all_ commits. For the 'public'
> state, we only annotate the last/tip commit that was pushed/published.
> From there, we can defer that all ancestor commits are also 'public'.

Right.

> For the 'secret' state, we do indeed annotate _all_ secret commits,
> but I believe this will be a somewhat limited number of commits. If
> your workflow forces you to annotate millions of commits as 'secret',
> I claim there is something wrong with your workflow.

Well, for the 'secret' we can rely on the fact that child of 'secret'
commit must also be 'secret' (non-publishable) if secret is to stay
secret.  Still marking all 'secret' commits might be better idea from
UI and from performance point of view.

> Second, git-notes were indeed designed scale well to handle a large
> number of notes, up to the same order of magnitude as the number of
> commits in your repo. (When git-notes was originally written, I
> successfully tested it on versions of a linux-kernel repo where every
> single commit was annotated).

Ah.  That is very nice!

> In this case, the number of 'public' 
> annotations in your repo would be equal to the number of pushes you
> do, and the number of 'secret' annotations would be equal to the
> number of 'secret' commits in your repo. I'd expect both of these
> numbers to be orders of magnitude smaller than the total number of
> commits in your repo (given a fairly typical workflow in a fairly
> typical repo).

Right.
 
> > Second, I have doubts if "phase" is really state of an individual commit,
> > and not the feature of revision walking.

It matters to presentation: can commit be simultaneously 'public' because
of one branch, and 'draft' because of other.

> I believe the 'public' state is a "feature of revision walking" (i.e.
> one annotated 'public' commit implies that all its ancestors are also
> 'public'). However, the 'secret' state should be bound to the
> individual commit, IMHO.

Good call, otherwise 'secret' commit could have been "side-leaked"
by other refs being pushed.

This means though that 'public' / 'draft' while looking similar to 'secret'
are in fact a bit different things.  In other words 'immutable' and
'impushable' traits are quite a bit different in behavior...

Especially that one acts at pre-rewrite time, and second pre-push time.

> > Take for example the situation where given commit is reference by
> > remote-tracking branch 'public/foo', and also by two local branches:
> > 'foo' with upstream 'public/foo', and local branch 'bar' with no upstream.
> >
> > Now it is quite obvious that this feature should prevent rewriting 'foo'
> > branch, for which commits are published upstream.  But what about branch
> > 'bar'?  Should we prevent rewriting (e.g. rebase) here too?  What about
> > rewinding 'bar' to point somewhere else.  What if 'bar' is really detached
> > HEAD?
> >
> > These questions need to be answered...
> 
> Good point. There are two questions we may need to answer: "Has commit
> X ever been published?", and "Has commit X ever been published in the
> context of branch Y?". In the latter case, we do indeed need to take
> the upstream branch into account.

I think the second one is more interested for rewrite safeties.

> Basically, there are three different "levels" for this rewrite/publish
> protection to run at:
> 
> 1. Do not meddle at all. This is the current behavior, and assumes
> that if the user rewrites and pushes something, the user knows what
> he/she is doing, and Git should not meddle (obviously unless the
> server refuses the push).

I think that there should be some easy way to force such behavior,
i.e. to discard rewrite safeties.

> 2. Warn/refuse rewriting commits in your upstream. This would only
> check branch X against its registered upstream. Only if there is a
> registered upstream, and you're about to rewrite commits that are
> reachable from the upstream remote-tracking branch, should Git
> intervene and warn/refuse the rewrite. This level would IMHO provide
> most of the benefit, and little or no trouble (i.e. false positives).

Right.  I wonder if we can get usage statistics from Mercurial users
about usage of their "phases" feature... though mapping terminology
for example 'upstream' from Git to Mercurial and vice versa can be
a pain, I guess.

> 3. Warn/refuse rewriting _any_ 'public' commit. Refuse to rewrite any
> commit that is reachable from any remote-tracking branch. Some would
> say that this is a Good Thing(tm), since it prevents a commit from
> being _copied_ (i.e. rebased or cherry-picked) between branches (you'd
> be in this camp if you run a tightly-controlled workflow, where you
> e.g. mandate upmerging patches from the oldest applicable branch
> instead of cherry-picking patches from a newer branch). However, other
> people would say that this is too limiting, and imposes unnecessary
> rules on the workflow of the project (where e.g. copying (by way of
> git-rebase) a topic branch from one place to another would cause an
> annoying false positive).

Well, we could always 'deny' on 2nd, and just 'warn' on 3rd...

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06 17:14               ` Jakub Narebski
@ 2012-02-06 20:16                 ` Johan Herland
  2012-02-07 14:31                   ` Jakub Narebski
  0 siblings, 1 reply; 34+ messages in thread
From: Johan Herland @ 2012-02-06 20:16 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Philip Oakley, git

On Mon, Feb 6, 2012 at 18:14, Jakub Narebski <jnareb@gmail.com> wrote:
> On Mon, 6 Feb 2012, Johan Herland wrote:
>> On Mon, Feb 6, 2012 at 15:44, Jakub Narebski <jnareb@gmail.com> wrote:
>> > Relying on (default) hooks to implement this feature has the disadvantage
>> > that it wouldn't be turned on by default... while this feature would be
>> > most helpful for users new to git (scared by refuse to push).
>>
>> True. I too believe that this will be most helpful if it is enabled by
>> default. That said, the easiest way to get there might be through
>> first demonstrating that it works in practice when implemented as
>> hooks.
>
> Yes, starting with prototype implementation using existing infrastructure
> (hooks) would be a very good idea.  (That's how first versions of what
> became submodules were implemented.)
>
> OTOH we should be aware of limitations of said prototype due to the fact
> that it is a prototype...

Agreed, but AFAICS (and modulo the addition of pre-rewrite and
pre/post-push hooks mentioned earlier) all of the things discussed so
far in this thread can be implemented as hooks.

>> > I am not sure either if everything (wrt. safety net) can be implemented
>> > via hooks.  One thing that I forgot about is preventing rewinding of
>> > branch past the published commit using e.g. "git reset --hard <commit>".
>> > Unless `pre-rewrite` hook could be used for that safety too...
>>
>> Hmm. I don't think we'll be able to "plug" all the holes that might
>> leave the user in a rewritten state (e.g. what if the user (possibly
>> with the help of some tool) does an "echo $SHA1 >
>> .git/refs/head/master"?).
>
> First, I was thinking about having safety net against rewriting published
> commits being present only in porcelain.  Plumbing would be not affected
> (perhaps there would be need to extend or add new plumbing to query "phase"
> state, though).

I also think this is very much a porcelain-only feature. I'd be more
worried if plumbing changes were needed.

>> And trying to plug too many holes might end
>> up annoying more experienced users who "know what they're doing".
>
> Second, forcing via command line parameter should always be an option.

Obviously, but if a large portion of the Git community felt the need
to always disable this feature, I'd say that we'd failed. The best
features are those that Just Work(tm).

>> Instead we might want to add a client-side check at push time. I
>> realize that this check is already done by the remote end, but the
>> client-side might be able to give a more helpful response along the
>> lines of:
>
> [...]
>
> Explanation is good, but the whole idea of rewriting safety is that you
> are informed (warned or denied) _before_ attempting rewrite and doing much
> work.

True, but there may be cases where the rewrite is not apparent until
after it has happened. E.g. a novice user may use 'git reset --hard'
in order to get to an earlier state for testing purposes, and then -
after completing the test - 'git reset --hard' back to the starting
point. I know this is not best practice, but is it bad enough that we
want to refuse it?

>> First, we don't need to annotate _all_ commits. For the 'public'
>> state, we only annotate the last/tip commit that was pushed/published.
>> From there, we can defer that all ancestor commits are also 'public'.
>
> Right.
>
>> For the 'secret' state, we do indeed annotate _all_ secret commits,
>> but I believe this will be a somewhat limited number of commits. If
>> your workflow forces you to annotate millions of commits as 'secret',
>> I claim there is something wrong with your workflow.
>
> Well, for the 'secret' we can rely on the fact that child of 'secret'
> commit must also be 'secret' (non-publishable) if secret is to stay
> secret.  Still marking all 'secret' commits might be better idea from
> UI and from performance point of view.

I don't think we should automatically assume that all children of a
'secret' commit are also 'secret'. First of all, the git DAG was not
made for iterating forwards in history, so given a 'secret' commit, it
is computationally expensive to enumerate all its implied-'secret'
descendants. More importantly though, I don't agree with the premise.
I would typically use the 'secret' state as follows: While debugging a
piece of code, I might commit a few debug print statements, and I
would typically mark this debug commit as 'secret', in order to
prevent myself from accidentally pushing this. Although it probably
doesn't matter in practice, I think it is wrong for the commits made
(temporarily) on top of this debug commit to be also considered
'secret'. They are only unpublishable as a consequence of being based
on the debug commit, and only until I get around to rebasing away the
'secret' debug commit.

>> > Second, I have doubts if "phase" is really state of an individual commit,
>> > and not the feature of revision walking.
>
> It matters to presentation: can commit be simultaneously 'public' because
> of one branch, and 'draft' because of other.
>
>> I believe the 'public' state is a "feature of revision walking" (i.e.
>> one annotated 'public' commit implies that all its ancestors are also
>> 'public'). However, the 'secret' state should be bound to the
>> individual commit, IMHO.
>
> Good call, otherwise 'secret' commit could have been "side-leaked"
> by other refs being pushed.
>
> This means though that 'public' / 'draft' while looking similar to 'secret'
> are in fact a bit different things.  In other words 'immutable' and
> 'impushable' traits are quite a bit different in behavior...
>
> Especially that one acts at pre-rewrite time, and second pre-push time.

Exactly. I find Mercurial 'phase' language confusing, precisely for
the reason that 'public' and 'secret' are DIFFERENT concepts. One
hinders rewrite and naturally applies to a commit AND its ancestors,
while the other hinders push and only applies to the commit itself.
The fact that they could be implemented by the same mechanisms (hooks
and notes) does not make them the same thing.

>> > Take for example the situation where given commit is reference by
>> > remote-tracking branch 'public/foo', and also by two local branches:
>> > 'foo' with upstream 'public/foo', and local branch 'bar' with no upstream.
>> >
>> > Now it is quite obvious that this feature should prevent rewriting 'foo'
>> > branch, for which commits are published upstream.  But what about branch
>> > 'bar'?  Should we prevent rewriting (e.g. rebase) here too?  What about
>> > rewinding 'bar' to point somewhere else.  What if 'bar' is really detached
>> > HEAD?
>> >
>> > These questions need to be answered...
>>
>> Good point. There are two questions we may need to answer: "Has commit
>> X ever been published?", and "Has commit X ever been published in the
>> context of branch Y?". In the latter case, we do indeed need to take
>> the upstream branch into account.
>
> I think the second one is more interested for rewrite safeties.
>
>> Basically, there are three different "levels" for this rewrite/publish
>> protection to run at:
>>
>> 1. Do not meddle at all. This is the current behavior, and assumes
>> that if the user rewrites and pushes something, the user knows what
>> he/she is doing, and Git should not meddle (obviously unless the
>> server refuses the push).
>
> I think that there should be some easy way to force such behavior,
> i.e. to discard rewrite safeties.

Indeed. We should probably have a simple config flag to enable the
rewrite protection. In fact I would argue that the flag should default
to false (disable protection) when unset, and then we should let
init/clone set the config flag to true (enable protection) in newly
created repos (unless explicitly disabled in the user/system configs).
This way, behavior does not change for existing repos, but new repos
are protected by default (with only a single command needed to disable
the protection).

>> 2. Warn/refuse rewriting commits in your upstream. This would only
>> check branch X against its registered upstream. Only if there is a
>> registered upstream, and you're about to rewrite commits that are
>> reachable from the upstream remote-tracking branch, should Git
>> intervene and warn/refuse the rewrite. This level would IMHO provide
>> most of the benefit, and little or no trouble (i.e. false positives).
>
> Right.  I wonder if we can get usage statistics from Mercurial users
> about usage of their "phases" feature... though mapping terminology
> for example 'upstream' from Git to Mercurial and vice versa can be
> a pain, I guess.

I'm unsure how useful it would be. IMHO Git and Mercurial are
different enough (and promote sightly different workflows) that I
don't trust the the average Mercurial user's preference for
Mercurial's 'phase' behavior to be transferable to the average Git
user's preference for a similar behavior in Git.

>> 3. Warn/refuse rewriting _any_ 'public' commit. Refuse to rewrite any
>> commit that is reachable from any remote-tracking branch. Some would
>> say that this is a Good Thing(tm), since it prevents a commit from
>> being _copied_ (i.e. rebased or cherry-picked) between branches (you'd
>> be in this camp if you run a tightly-controlled workflow, where you
>> e.g. mandate upmerging patches from the oldest applicable branch
>> instead of cherry-picking patches from a newer branch). However, other
>> people would say that this is too limiting, and imposes unnecessary
>> rules on the workflow of the project (where e.g. copying (by way of
>> git-rebase) a topic branch from one place to another would cause an
>> annoying false positive).
>
> Well, we could always 'deny' on 2nd, and just 'warn' on 3rd...

Good idea.


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06 20:16                 ` Johan Herland
@ 2012-02-07 14:31                   ` Jakub Narebski
  2012-02-07 15:09                     ` Johan Herland
  2012-02-07 17:27                     ` Ronan Keryell
  0 siblings, 2 replies; 34+ messages in thread
From: Jakub Narebski @ 2012-02-07 14:31 UTC (permalink / raw)
  To: Johan Herland; +Cc: Philip Oakley, git

On Mon, 6 Feb 2012, Johan Herland wrote:
> On Mon, Feb 6, 2012 at 18:14, Jakub Narebski <jnareb@gmail.com> wrote:
>> On Mon, 6 Feb 2012, Johan Herland wrote:
>>> On Mon, Feb 6, 2012 at 15:44, Jakub Narebski <jnareb@gmail.com> wrote:

[...]
>> Yes, starting with prototype implementation using existing infrastructure
>> (hooks) would be a very good idea.  (That's how first versions of what
>> became submodules were implemented.)
>>
>> OTOH we should be aware of limitations of said prototype due to the fact
>> that it is a prototype...
> 
> Agreed, but AFAICS (and modulo the addition of pre-rewrite and
> pre/post-push hooks mentioned earlier) all of the things discussed so
> far in this thread can be implemented as hooks.

That would be nice.

And the new hooks (pre-rewrite, pre/post-push) would be useful anyway,
I think.

[...]
>>> And trying to plug too many holes might end
>>> up annoying more experienced users who "know what they're doing".
>>
>> Second, forcing via command line parameter should always be an option.
> 
> Obviously, but if a large portion of the Git community felt the need
> to always disable this feature, I'd say that we'd failed. The best
> features are those that Just Work(tm).

Well, we have some features in git like advice.* that cater to newbies.
This could be such newbie-mostly feature too.

Anyway what I was thinking about is that forcing is for those hopefully
rare cases where Git safety net is too strict, where DWIM-mery fails.

>>> Instead we might want to add a client-side check at push time. I
>>> realize that this check is already done by the remote end, but the
>>> client-side might be able to give a more helpful response along the
>>> lines of:
>>
>> [...]
>>
>> Explanation is good, but the whole idea of rewriting safety is that you
>> are informed (warned or denied) _before_ attempting rewrite and doing much
>> work.
> 
> True, but there may be cases where the rewrite is not apparent until
> after it has happened. E.g. a novice user may use 'git reset --hard'
> in order to get to an earlier state for testing purposes, and then -
> after completing the test - 'git reset --hard' back to the starting
> point. I know this is not best practice, but is it bad enough that we
> want to refuse it?

I'd like to have safety net also for using 'git reset --hard' to rewind
publishable branch past published history.  Though preventing e.g. force
rename of other branch to publishable branch so it doesn't fast-forward
WRT its upstream would be going to far, I guess...

>>> First, we don't need to annotate _all_ commits. For the 'public'
>>> state, we only annotate the last/tip commit that was pushed/published.
>>> From there, we can defer that all ancestor commits are also 'public'.
>>
>> Right.
>>
>>> For the 'secret' state, we do indeed annotate _all_ secret commits,
>>> but I believe this will be a somewhat limited number of commits. If
>>> your workflow forces you to annotate millions of commits as 'secret',
>>> I claim there is something wrong with your workflow.
>>
>> Well, for the 'secret' we can rely on the fact that child of 'secret'
>> commit must also be 'secret' (non-publishable) if secret is to stay
>> secret.  Still marking all 'secret' commits might be better idea from
>> UI and from performance point of view.
> 
> I don't think we should automatically assume that all children of a
> 'secret' commit are also 'secret'.

All right.  What is important is that safety net related to 'secret'
commits would prevent publishing such commits even if you build non-secret
commits on top of 'secret' one.

>                                 First of all, the git DAG was not 
> made for iterating forwards in history, so given a 'secret' commit, it
> is computationally expensive to enumerate all its implied-'secret'
> descendants.

Right, this would enormously complicate detecting 'secret'-ness of commit
when browsing history / querying "phase" of a commit..

> More importantly though, I don't agree with the premise. 
> I would typically use the 'secret' state as follows: While debugging a
> piece of code, I might commit a few debug print statements, and I
> would typically mark this debug commit as 'secret', in order to
> prevent myself from accidentally pushing this. Although it probably
> doesn't matter in practice, I think it is wrong for the commits made
> (temporarily) on top of this debug commit to be also considered
> 'secret'. They are only unpublishable as a consequence of being based
> on the debug commit, and only until I get around to rebasing away the
> 'secret' debug commit.

Right, thanks for sanity check.

So another difference that in its full complexity the 'secret' is an
attribute of an individual commit which is (usually) local to repository,
while 'public' is in its full complexity an attribute of revision walking
rather than something that can be attached to a commit.

Also, when thinking about different scenarios of why one would like to
mark commit as 'secret', we might want to be able to mark commit as
secret / unpublishable with respect to _subset_ of remotes, so e.g.
I am prevented from accidentally publishing commits marked as 'secret'
to public repository, or to CI/QA repository, but I can push (perhaps
with warning) to group repository, together with 'secret'-ness state
of said commit... 

... though it wouldn't be as much 'secret' as 'confidential' ;-)

>>>> Second, I have doubts if "phase" is really state of an individual commit,
>>>> and not the feature of revision walking.
>>
>> It matters to presentation: can commit be simultaneously 'public' because
>> of one branch, and 'draft' because of other.
>>
>>> I believe the 'public' state is a "feature of revision walking" (i.e.
>>> one annotated 'public' commit implies that all its ancestors are also
>>> 'public'). However, the 'secret' state should be bound to the
>>> individual commit, IMHO.
>>
>> Good call, otherwise 'secret' commit could have been "side-leaked"
>> by other refs being pushed.
>>
>> This means though that 'public' / 'draft' while looking similar to 'secret'
>> are in fact a bit different things.  In other words 'immutable' and
>> 'impushable' traits are quite a bit different in behavior...
>>
>> Especially that one acts at pre-rewrite time, and second pre-push time.
> 
> Exactly. I find Mercurial 'phase' language confusing, precisely for
> the reason that 'public' and 'secret' are DIFFERENT concepts. One
> hinders rewrite and naturally applies to a commit AND its ancestors,
> while the other hinders push and only applies to the commit itself.
> The fact that they could be implemented by the same mechanisms (hooks
> and notes) does not make them the same thing.

Well, if one doesn't think about all possible usages and complications,
the simplicity of metaphor of "phases" is quite compelling.  We have
(at first glance)

   public < draft < secret

states of a commit (changeset), just like we have

   solid < liquid < gaseous

phases of matter.  

If we treat traits of 'immutability' and 'publishability' as true boolean,
such ordering of "phases" looks natural.  Just like matter goes from gas
to liquid to solid with lowering temperature, changeset "phase" goes from
'secret' to 'draft' to 'public' following ancestry chain.

But if we go for expressivity and power like the rest of Git, rather than
for simplicity like Mercurial does, the differences between 'secret'-ness
and 'public'-ness make it unlikely that it would be a good idea to encompass
those in a single UI and a single concept.  I guess that it can be another
issue where Mercurial oversimplifies its approach (c.f. branches which
started from clone to branch and anonymous branches, to "named branches"
which are really commit labels, to local-only bookmark branches, to 
transferrable Git-like bookmark branches but with single namespace, and
only just considering equivalent of Git's refspec and refs mapping; c.f.
revision ranges based on local numbering of commits which made ranges
local too, to new Git-like topological `--ancestry-path`-like ranges). 

Also with per-remote 'secret'-ness, we might have the scenario where
a commit is marked as 'secret' for public repo, we have pushed it to
group repo (and it's 'secret'-ness trait together with it), an now we
want to be warned if we try to rewrite said commit.

So, separate traits it is (of revision walk, or of a commit), rather
than "smush-together" antity / attribute of "phase" of revision.

That is BTW why I wanted us to have this discussion... :-D

[...]
>>> Basically, there are three different "levels" for this rewrite/publish
>>> protection to run at:
>>>
>>> 1. Do not meddle at all. This is the current behavior, and assumes
>>> that if the user rewrites and pushes something, the user knows what
>>> he/she is doing, and Git should not meddle (obviously unless the
>>> server refuses the push).
>>
>> I think that there should be some easy way to force such behavior,
>> i.e. to discard rewrite safeties.
> 
> Indeed. We should probably have a simple config flag to enable the
> rewrite protection. In fact I would argue that the flag should default
> to false (disable protection) when unset, and then we should let
> init/clone set the config flag to true (enable protection) in newly
> created repos (unless explicitly disabled in the user/system configs).
> This way, behavior does not change for existing repos, but new repos
> are protected by default (with only a single command needed to disable
> the protection).

And commit option for discarding safeties for specific push, or for
specific rewrite (e.g. pushed at night, noticed an error in commit
message, force-amended, and force-pushed again, hoping that the window
of opportunity is sufficintly small so that nobody will be affected).

>>> 2. Warn/refuse rewriting commits in your upstream. This would only
>>> check branch X against its registered upstream. Only if there is a
>>> registered upstream, and you're about to rewrite commits that are
>>> reachable from the upstream remote-tracking branch, should Git
>>> intervene and warn/refuse the rewrite. This level would IMHO provide
>>> most of the benefit, and little or no trouble (i.e. false positives).
>>
>> Right.  I wonder if we can get usage statistics from Mercurial users
>> about usage of their "phases" feature... though mapping terminology
>> for example 'upstream' from Git to Mercurial and vice versa can be
>> a pain, I guess.
> 
> I'm unsure how useful it would be. IMHO Git and Mercurial are
> different enough (and promote sightly different workflows) that I
> don't trust the the average Mercurial user's preference for
> Mercurial's 'phase' behavior to be transferable to the average Git
> user's preference for a similar behavior in Git.

Well, bad statistics might be better than no statistics.  What would
be especially interesting is _complaints_ about "phases" feature,
don't you think, i.e. where the "phases" metaphor fails.

[...]

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-07 14:31                   ` Jakub Narebski
@ 2012-02-07 15:09                     ` Johan Herland
  2012-02-10 19:38                       ` Jakub Narebski
  2012-04-07 14:49                       ` Steven Michalske
  2012-02-07 17:27                     ` Ronan Keryell
  1 sibling, 2 replies; 34+ messages in thread
From: Johan Herland @ 2012-02-07 15:09 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Philip Oakley, git

(we are pretty much in violent agreement, so I will only comment where
I find it necessary)

On Tue, Feb 7, 2012 at 15:31, Jakub Narebski <jnareb@gmail.com> wrote:
> Also, when thinking about different scenarios of why one would like to
> mark commit as 'secret', we might want to be able to mark commit as
> secret / unpublishable with respect to _subset_ of remotes, so e.g.
> I am prevented from accidentally publishing commits marked as 'secret'
> to public repository, or to CI/QA repository, but I can push (perhaps
> with warning) to group repository, together with 'secret'-ness state
> of said commit...
>
> ... though it wouldn't be as much 'secret' as 'confidential' ;-)

Another way to achieve this would be to have a config flag to control
whether Git checks for the 'secret' flag before pushing. This config
flag could be set at the system/user level (to enable/disable the
feature as a whole), at the repo level (to enable/disable it in a
given repo), at the remote level (to enable/disable it on a given
repo), and finally at the branch level (to enable-disable it for a
given branch (and its upstream)). Thus you could have a .git/config
that looked like this:

  [core]
  refusePushSecret = true

  [remote "foo"]
  refusePushSecret = false
  url = ...
  fetch = ...

  [branch "baz"]
  remote = foo
  merge = refs/heads/baz
  refusePushSecret = true

This config would:

 - refuse to push 'secret' commits from branch 'baz'
(branch.baz.refusePushSecret == true)

 - but allow to push other branches with 'secret' commits to remote
'foo' (remote.foo.refusePushSecret == false)

 - but refuse to push 'secret' commits to other remotes
(core.refusePushSecret == true)

(The order of precedence would be: branch config > remote config >
repo config > user config > system config > default when unset)

I am unsure whether the 'secret'-ness of a commit should follow across
the push, but if you do (assuming we store the 'secret' flag using
git-notes) this is simply a matter of synchronizing the
refs/notes/secret to the same remote.


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-07 15:09                     ` Johan Herland
@ 2012-02-10 19:38                       ` Jakub Narebski
  2012-02-10 20:19                         ` Philip Oakley
  2012-02-11 13:10                         ` Johan Herland
  2012-04-07 14:49                       ` Steven Michalske
  1 sibling, 2 replies; 34+ messages in thread
From: Jakub Narebski @ 2012-02-10 19:38 UTC (permalink / raw)
  To: Johan Herland; +Cc: Philip Oakley, git

On Tue, 7 Feb 2012, Johan Herland wrote:

> (we are pretty much in violent agreement, so I will only comment where
> I find it necessary)

So now comes the hard part: actually implementing (well, designing and
implementing) prototypes for 'secret' trait and 'public' trait...
 
> On Tue, Feb 7, 2012 at 15:31, Jakub Narebski <jnareb@gmail.com> wrote:

> > Also, when thinking about different scenarios of why one would like to
> > mark commit as 'secret', we might want to be able to mark commit as
> > secret / unpublishable with respect to _subset_ of remotes, so e.g.
> > I am prevented from accidentally publishing commits marked as 'secret'
> > to public repository, or to CI/QA repository, but I can push (perhaps
> > with warning) to group repository, together with 'secret'-ness state
> > of said commit...
> >
> > ... though it wouldn't be as much 'secret' as 'confidential' ;-)
> 
> Another way to achieve this would be to have a config flag to control
> whether Git checks for the 'secret' flag before pushing. This config
> flag could be set at the system/user level (to enable/disable the
> feature as a whole), at the repo level (to enable/disable it in a
> given repo), at the remote level (to enable/disable it on a given
> repo), and finally at the branch level (to enable-disable it for a
> given branch (and its upstream)). Thus you could have a .git/config
> that looked like this:
> 
>   [core]
>   refusePushSecret = true
> 
>   [remote "foo"]
>   refusePushSecret = false
>   url = ...
>   fetch = ...
> 
>   [branch "baz"]
>   remote = foo
>   merge = refs/heads/baz
>   refusePushSecret = true
> 
> This config would:
> 
>  - refuse to push 'secret' commits from branch 'baz'
> (branch.baz.refusePushSecret == true)
> 
>  - but allow to push other branches with 'secret' commits to remote
> 'foo' (remote.foo.refusePushSecret == false)
> 
>  - but refuse to push 'secret' commits to other remotes
> (core.refusePushSecret == true)
> 
> (The order of precedence would be: branch config > remote config >
> repo config > user config > system config > default when unset)

You read my mind.

> I am unsure whether the 'secret'-ness of a commit should follow across
> the push, but if you do (assuming we store the 'secret' flag using
> git-notes) this is simply a matter of synchronizing the
> refs/notes/secret to the same remote.

I think it should, so that 'secret' commit would not escape by accident
via a group secret repository.

What makes it hard (I think) is that we would prefer to transfer 
'secret'-ness only for pushed commits.  That might be problem for notes
based implementation of 'secret' annotation and 'secret'-ness transfer...
though I guess knowing that there exist 'secret' commit with given SHA1
which we do not have and should not have is not much breach of
confidentiality.  Still...

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-10 19:38                       ` Jakub Narebski
@ 2012-02-10 20:19                         ` Philip Oakley
  2012-02-11 13:10                         ` Johan Herland
  1 sibling, 0 replies; 34+ messages in thread
From: Philip Oakley @ 2012-02-10 20:19 UTC (permalink / raw)
  To: Jakub Narebski, Johan Herland; +Cc: git

From: "Jakub Narebski" <jnareb@gmail.com> Sent: Friday, February 10, 2012 
7:38 PM
> On Tue, 7 Feb 2012, Johan Herland wrote:
>
>> (we are pretty much in violent agreement, so I will only comment where
>> I find it necessary)
>
> So now comes the hard part: actually implementing (well, designing and
> implementing) prototypes for 'secret' trait and 'public' trait...

It all sounds very sensible.

The one additional implemenation idea I'd like is to have, which is somewhat 
at a tangent to the secrecy issue, is having some record of what was pushed 
(made Public). For the sub-module case this is so that its super project can 
know if its submodules are public or not. A public super project that has 
'secret' sub-modules is awkward to say the least.

That's my thought anyway...

>
>> On Tue, Feb 7, 2012 at 15:31, Jakub Narebski <jnareb@gmail.com> wrote:
>
>> > Also, when thinking about different scenarios of why one would like to
>> > mark commit as 'secret', we might want to be able to mark commit as
>> > secret / unpublishable with respect to _subset_ of remotes, so e.g.
>> > I am prevented from accidentally publishing commits marked as 'secret'
>> > to public repository, or to CI/QA repository, but I can push (perhaps
>> > with warning) to group repository, together with 'secret'-ness state
>> > of said commit...
>> >
>> > ... though it wouldn't be as much 'secret' as 'confidential' ;-)
>>
>> Another way to achieve this would be to have a config flag to control
>> whether Git checks for the 'secret' flag before pushing. This config
>> flag could be set at the system/user level (to enable/disable the
>> feature as a whole), at the repo level (to enable/disable it in a
>> given repo), at the remote level (to enable/disable it on a given
>> repo), and finally at the branch level (to enable-disable it for a
>> given branch (and its upstream)). Thus you could have a .git/config
>> that looked like this:
>>
>>   [core]
>>   refusePushSecret = true
>>
>>   [remote "foo"]
>>   refusePushSecret = false
>>   url = ...
>>   fetch = ...
>>
>>   [branch "baz"]
>>   remote = foo
>>   merge = refs/heads/baz
>>   refusePushSecret = true
>>
>> This config would:
>>
>>  - refuse to push 'secret' commits from branch 'baz'
>> (branch.baz.refusePushSecret == true)
>>
>>  - but allow to push other branches with 'secret' commits to remote
>> 'foo' (remote.foo.refusePushSecret == false)
>>
>>  - but refuse to push 'secret' commits to other remotes
>> (core.refusePushSecret == true)
>>
>> (The order of precedence would be: branch config > remote config >
>> repo config > user config > system config > default when unset)
>
> You read my mind.
>
>> I am unsure whether the 'secret'-ness of a commit should follow across
>> the push, but if you do (assuming we store the 'secret' flag using
>> git-notes) this is simply a matter of synchronizing the
>> refs/notes/secret to the same remote.
>
> I think it should, so that 'secret' commit would not escape by accident
> via a group secret repository.
>
> What makes it hard (I think) is that we would prefer to transfer
> 'secret'-ness only for pushed commits.  That might be problem for notes
> based implementation of 'secret' annotation and 'secret'-ness transfer...
> though I guess knowing that there exist 'secret' commit with given SHA1
> which we do not have and should not have is not much breach of
> confidentiality.  Still...
>
> -- 
> Jakub Narebski
> Poland
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> -----
> No virus found in this message.
> Checked by AVG - www.avg.com
> Version: 2012.0.1913 / Virus Database: 2112/4801 - Release Date: 02/10/12
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-10 19:38                       ` Jakub Narebski
  2012-02-10 20:19                         ` Philip Oakley
@ 2012-02-11 13:10                         ` Johan Herland
  2012-02-11 13:45                           ` Jakub Narebski
  2012-04-07 15:01                           ` [RFD] Rewriting safety - warn before/when rewriting published history Steven Michalske
  1 sibling, 2 replies; 34+ messages in thread
From: Johan Herland @ 2012-02-11 13:10 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Philip Oakley, git

On Fri, Feb 10, 2012 at 20:38, Jakub Narebski <jnareb@gmail.com> wrote:
> On Tue, 7 Feb 2012, Johan Herland wrote:
>> On Tue, Feb 7, 2012 at 15:31, Jakub Narebski <jnareb@gmail.com> wrote:
>> I am unsure whether the 'secret'-ness of a commit should follow across
>> the push, but if you do (assuming we store the 'secret' flag using
>> git-notes) this is simply a matter of synchronizing the
>> refs/notes/secret to the same remote.
>
> I think it should, so that 'secret' commit would not escape by accident
> via a group secret repository.
>
> What makes it hard (I think) is that we would prefer to transfer
> 'secret'-ness only for pushed commits.  That might be problem for notes
> based implementation of 'secret' annotation and 'secret'-ness transfer...
> though I guess knowing that there exist 'secret' commit with given SHA1
> which we do not have and should not have is not much breach of
> confidentiality.  Still...

If you don't want to transfer all of refs/notes/secret, you would
probably have to extend the git protocol with a per-commit 'secret'
flag (which would then be applied to the receiving repo's
refs/notes/secret).

Still, this is all specific to the 'secret' feature, which IMHO is
much less important then the 'public' feature. Implementing the
barebones 'public' feature (i.e. refuse rewrite of commits reachable
from upstream) is much less work, and would be enough for 90% of git
users, I believe.


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-11 13:10                         ` Johan Herland
@ 2012-02-11 13:45                           ` Jakub Narebski
  2012-02-20 21:07                             ` [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream Johan Herland
  2012-04-07 15:01                           ` [RFD] Rewriting safety - warn before/when rewriting published history Steven Michalske
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-02-11 13:45 UTC (permalink / raw)
  To: Johan Herland; +Cc: Philip Oakley, git

On Sat, 11 Feb 2012, Johan Herland wrote:
> On Fri, Feb 10, 2012 at 20:38, Jakub Narebski <jnareb@gmail.com> wrote:
> > On Tue, 7 Feb 2012, Johan Herland wrote:
[...]
> > > I am unsure whether the 'secret'-ness of a commit should follow across
> > > the push, but if you do (assuming we store the 'secret' flag using
> > > git-notes) this is simply a matter of synchronizing the
> > > refs/notes/secret to the same remote.
> >
> > I think it should, so that 'secret' commit would not escape by accident
> > via a group secret repository.
> >
> > What makes it hard (I think) is that we would prefer to transfer
> > 'secret'-ness only for pushed commits.  That might be problem for notes
> > based implementation of 'secret' annotation and 'secret'-ness transfer...
> > though I guess knowing that there exist 'secret' commit with given SHA1
> > which we do not have and should not have is not much breach of
> > confidentiality.  Still...
> 
> If you don't want to transfer all of refs/notes/secret, you would
> probably have to extend the git protocol with a per-commit 'secret'
> flag (which would then be applied to the receiving repo's
> refs/notes/secret).

Or create per-remote custom notes ref, for example for 'foo' remote it
would be 'refs/remotes/foo/notes/secret'... assuming that canonalization
of remote-tracking refs goes in (refs/remotes/foo/{heads,tags,notes,replace})

This would be updated with 'secret'-ness state of comits being pushed
before actual push, and secretness notes ref would be pushed together
with actual push.

> Still, this is all specific to the 'secret' feature, which IMHO is
> much less important then the 'public' feature. Implementing the
> barebones 'public' feature (i.e. refuse rewrite of commits reachable
> from upstream) is much less work, and would be enough for 90% of git
> users, I believe.

Hmmm... I thought that is 'public' feature that is more work, especially
in full incarnation.  But perhaps bare bones (no 'pre-push' or 'pre-rewrite'
hooks, no traits info in "git log" output, per remote tracking of 'public'
status only, no support for bundles or send-email, etc.) could be as easy
or easier.

As to what is more important for git users... perhaps short survey would
help here?  I wonder if asking Mercurial users about their use of "phases"
on their mailing would help us...

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-11 13:45                           ` Jakub Narebski
@ 2012-02-20 21:07                             ` Johan Herland
  2012-02-20 21:21                               ` Johan Herland
  2012-02-20 22:43                               ` Junio C Hamano
  0 siblings, 2 replies; 34+ messages in thread
From: Johan Herland @ 2012-02-20 21:07 UTC (permalink / raw)
  To: git; +Cc: Johan Herland, gitster, jnareb, philipoakley

Teach the pre-rebase sample hook to refuse rewriting commits on a branch
that are present in that branch's @{upstream}. This is to prevent users
from rewriting commits that have already been published.

If the branch has no @{upstream}, or the commits-to-be-rebased are not
reachable from the upstream (hence assumed to be unpublished), the rebase
is not refused.

This patch is not an ideal solution to the problem, for at least the
following reasons:

 - There is no way for the user to override this check, except skipping
   the pre-rebase hook entirely with --no-verify.

 - The check only works for branches with a configured upstream. If the
   user's workflow does not rely on upstream branches, or uses some other
   method of publishing commits, the check will produce false negatives
   (i.e. allow rebases that should have been refused).

 - The check only applies to rebase. I wanted to add the same check
   on 'commit --amend', but there's no obvious way to detect --amend
   from within the pre-commit hook.

 - There may be other rewrite scenarios where we want to do this check,
   such as 'git reset'. Maybe a pre-rewrite hook should be added?

 - Some (including myself) want this check to be performed by default,
   since it's mostly targeted at newbies that are less likely to enable
   the pre-rebase (pre-rewrite) hook, so maybe the check should be added
   to core git instead.

Discussed-with: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Johan Herland <johan@herland.net>
---
 templates/hooks--pre-rebase.sample |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/templates/hooks--pre-rebase.sample b/templates/hooks--pre-rebase.sample
index 053f111..4c28b27 100755
--- a/templates/hooks--pre-rebase.sample
+++ b/templates/hooks--pre-rebase.sample
@@ -25,6 +25,20 @@ else
 	exit 0 ;# we do not interrupt rebasing detached HEAD
 fi
 
+# Are we rewriting upstreamed commits?
+upstream=`git rev-parse --verify "${topic#refs/heads/}@{u}" 2>/dev/null`
+if test -n "$upstream"
+then
+	# See if any of the commits to be rebased are reachable from upstream.
+	basecommit=`git rev-parse --verify "$basebranch"`
+	mergebase=`git merge-base "$basecommit" "$upstream"`
+	if test "$basecommit" != "$upstream" -a "$basecommit" = "$mergebase"
+	then
+		echo >&2 "Cannot rebase commits that are in $topic's upstream"
+		exit 1
+	fi
+fi
+
 case "$topic" in
 refs/heads/??/*)
 	;;
-- 
1.7.9.1.314.ga9004

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-20 21:07                             ` [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream Johan Herland
@ 2012-02-20 21:21                               ` Johan Herland
  2012-02-20 22:43                               ` Junio C Hamano
  1 sibling, 0 replies; 34+ messages in thread
From: Johan Herland @ 2012-02-20 21:21 UTC (permalink / raw)
  To: git; +Cc: Johan Herland, gitster, jnareb, philipoakley

On Mon, Feb 20, 2012 at 22:07, Johan Herland <johan@herland.net> wrote:
> Teach the pre-rebase sample hook to refuse rewriting commits on a branch
> that are present in that branch's @{upstream}. This is to prevent users
> from rewriting commits that have already been published.
>
> If the branch has no @{upstream}, or the commits-to-be-rebased are not
> reachable from the upstream (hence assumed to be unpublished), the rebase
> is not refused.
>
> This patch is not an ideal solution to the problem, for at least the
> following reasons:
>
>  - There is no way for the user to override this check, except skipping
>   the pre-rebase hook entirely with --no-verify.
>
>  - The check only works for branches with a configured upstream. If the
>   user's workflow does not rely on upstream branches, or uses some other
>   method of publishing commits, the check will produce false negatives
>   (i.e. allow rebases that should have been refused).
>
>  - The check only applies to rebase. I wanted to add the same check
>   on 'commit --amend', but there's no obvious way to detect --amend
>   from within the pre-commit hook.
>
>  - There may be other rewrite scenarios where we want to do this check,
>   such as 'git reset'. Maybe a pre-rewrite hook should be added?
>
>  - Some (including myself) want this check to be performed by default,
>   since it's mostly targeted at newbies that are less likely to enable
>   the pre-rebase (pre-rewrite) hook, so maybe the check should be added
>   to core git instead.
>
> Discussed-with: Jakub Narebski <jnareb@gmail.com>
> Signed-off-by: Johan Herland <johan@herland.net>
> ---

I forgot to explain that this patch is not really submitted for
inclusion, but rather to continue the discussion of getting rewrite
safety properly implemented in git. As such, the problems noted in the
above commit message are probably more important than the patch
itself...

Also, this implements only a small subset of what has been discussed
regarding 'public' and 'secret' properties of commits in the preceding
thread. However, I believe solving this part of the problem
(preventing upstreamed commits from being rewritten) will make 90% of
users happy, and that it's worth fixing on its own merits.


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-20 21:07                             ` [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream Johan Herland
  2012-02-20 21:21                               ` Johan Herland
@ 2012-02-20 22:43                               ` Junio C Hamano
  2012-02-21  0:03                                 ` Johan Herland
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2012-02-20 22:43 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, jnareb, philipoakley

Johan Herland <johan@herland.net> writes:

> Teach the pre-rebase sample hook to refuse rewriting commits on a branch
> that are present in that branch's @{upstream}. This is to prevent users
> from rewriting commits that have already been published.

If the user has configured an option to always create @{u} when creating a
branch from somewhere else, transplanting $n good commits from his master
that is forked from the shared master onto his maint would be done like
this:

	$ git checkout -b copy master
        $ git rebase -i --onto maint HEAD~$n

If these good commits have been published to 'master', because the
upstream of 'copy' is set to the local 'master', would the new mechanism
hinder this attempt to backport good fixes?  Perhaps it is safer to
trigger only when @{u} exists and it is not local?

But because you wanted to discuss more about the issues than the
implementation, let me think aloud a bit, reviewing what I usually do.

I keep things simpler by sticking to a very simple rule. I allow myself to
rebase only what is not yet in 'next', so the logic becomes a simple "am I
creating a new commit based on what is already in 'next'?"

During the course of integration testing with 'next', however, I often
find a topic or two that I have merged to it is less than ideal, and of
course, the whole point of doing integration testing with 'next' is to
find such problematic topics before pushing 'next' out.  I rewind 'next',
rebuild the problematic topics, and then rebuild 'next', and all of these
happen before 'next' is pushed out.  The step that rewinds 'next' that
acquired problematic versions of the topics makes the topics eligible for
rebase.

That would mean that a configuration variable "rebase.bottomLimit = next"
is sufficient to implement such a check for me.  No per-branch bottom is
needed, because everything is merged to 'next' and tested to see if they
do not need further rebases for fixing them up before they are published.

Perhaps "I mistakenly rebased something that I have already published" is
a mere symptom a bigger problem.  The issue may not be that we do not give
them a good tool to help them to be more careful with less effort on their
part before they rebase.  It may instead be that it is too easy to publish
branches that are not ready to be pushed out, and that is the real cause
of the "I realized I need to fix the topic and I fixed it, but I did not
realize that it was too late and I shouldn't have rebased" problem.

I wonder if it would be a more direct solution to the issue you are
raising to give them a good tool to help them to be more careful with less
effort on their part before they publish (not before they rebase).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-20 22:43                               ` Junio C Hamano
@ 2012-02-21  0:03                                 ` Johan Herland
  2012-02-21  7:44                                   ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Johan Herland @ 2012-02-21  0:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jnareb, philipoakley

On Mon, Feb 20, 2012 at 23:43, Junio C Hamano <gitster@pobox.com> wrote:
> Johan Herland <johan@herland.net> writes:
>> Teach the pre-rebase sample hook to refuse rewriting commits on a branch
>> that are present in that branch's @{upstream}. This is to prevent users
>> from rewriting commits that have already been published.
>
> If the user has configured an option to always create @{u} when creating a
> branch from somewhere else, transplanting $n good commits from his master
> that is forked from the shared master onto his maint would be done like
> this:
>
>        $ git checkout -b copy master
>        $ git rebase -i --onto maint HEAD~$n
>
> If these good commits have been published to 'master', because the
> upstream of 'copy' is set to the local 'master', would the new mechanism
> hinder this attempt to backport good fixes?  Perhaps it is safer to
> trigger only when @{u} exists and it is not local?

Ah, you're talking about setting branch.autosetupmerge = always. I
didn't know about that until I looked it up, just now. My first
impression is "who would want to use that?", but since it's there, we
should not break it. So, yes, we should probably apply the check when
@{u} exists, and refers to a remote-tracking branch. (One could argue
that this is also unsafe, since the current repo is a perfectly valid
remote, but if you do that, I'd be inclined to let you shoot yourself
in the foot.)

Also, if this check becomes part of core git, we obviously want some
config option to en/disable it. Preferably at multiple levels, as
discussed earlier in this thread:

  core.rewriteUpstream = refuse/false, warn, allow/true
(en/disables the check for the entire repo, but may be overridden by:)

  remote.<name>.rewriteUpstream = refuse/false, warn, allow/true
(en/disables the check for branches whose upstream is the given
remote, but may be overridden by:)

  branch.<name>.rewriteUpstream = refuse/false, warn, allow/true
(en/disables the check for the given local branch)

This allows fine-grained control of which branches/remotes you
consider 'public', and which are ok to rewrite.

> But because you wanted to discuss more about the issues than the
> implementation, let me think aloud a bit, reviewing what I usually do.
>
> I keep things simpler by sticking to a very simple rule. I allow myself to
> rebase only what is not yet in 'next', so the logic becomes a simple "am I
> creating a new commit based on what is already in 'next'?"
>
> During the course of integration testing with 'next', however, I often
> find a topic or two that I have merged to it is less than ideal, and of
> course, the whole point of doing integration testing with 'next' is to
> find such problematic topics before pushing 'next' out.  I rewind 'next',
> rebuild the problematic topics, and then rebuild 'next', and all of these
> happen before 'next' is pushed out.  The step that rewinds 'next' that
> acquired problematic versions of the topics makes the topics eligible for
> rebase.
>
> That would mean that a configuration variable "rebase.bottomLimit = next"
> is sufficient to implement such a check for me.  No per-branch bottom is
> needed, because everything is merged to 'next' and tested to see if they
> do not need further rebases for fixing them up before they are published.

What  you are describing here may be a common workflow, but
"rebase.bottomLimit" is still very specific to that kind of workflow.
What I'm after is a much more workflow-agnostic concept of:

  "If I have pushed something, I should probably not rebase it"

I believe this rule is sufficiently common in most workflows to
warrant the addition of this safety-check to Git. (And making the
safety-check configurable caters to those that don't need/want it.)

> Perhaps "I mistakenly rebased something that I have already published" is
> a mere symptom a bigger problem.  The issue may not be that we do not give
> them a good tool to help them to be more careful with less effort on their
> part before they rebase.  It may instead be that it is too easy to publish
> branches that are not ready to be pushed out, and that is the real cause
> of the "I realized I need to fix the topic and I fixed it, but I did not
> realize that it was too late and I shouldn't have rebased" problem.
>
> I wonder if it would be a more direct solution to the issue you are
> raising to give them a good tool to help them to be more careful with less
> effort on their part before they publish (not before they rebase).

Yes, in many cases, the user will probably agree that the rewrite
should ideally have happened _before_ the branch was first published.
However, I'm not sure how we can help the user _not_ publish the
branch until it's ready. We could throw in a warning (or even a stupid
"Are  you really sure you want to publish?") before pushing a branch,
but I don't think it would help at all.

I think the following decribes what often happens for many users:

 1. User A pushes the branch to a public repo.
 2. User B points out a simple mistake in the branch.
 3. User A makes a fix
 4. User A squashes the fix into the (already-published) history.
 5. User A attempts to push the "fixed" history (but is rejected by
the public repo because of non-fast-forward).

At this point, the damage is already done, since often neither of the
following alternatives are desirable:

 6a. (safe) User A realizes the error and undoes the history-rewrite
(this is not easy for novice users), leaving the fix on top of the
already-published history, and re-pushes a fast-forwarding history.
OR
 6b. (dangerous) User A reattempts the push with --force. User B (and
user C, D, ...) is left to clean up the mess.

You could say that User A should be more careful and push to a "less
public" repo in step #1 (thus allowing the fix to be squashed before
pushing to a "more public" repo in step #5), but how "public" is
"public" enough to have someone point out the bug, but still
"unpublic" enough to allow rebase?

In general, I really cannot see any opportunity before step #4 where
we can stop (or warn) the user from what is about to happen. And I
think that refusing rewrites of commits that are already present in
the @{upstream} remote-tracking branch is good enough to help most
users avoid steps #4 through #6 (in a push-based workflow[1]). In
fact, from a pedagogical POV, I think step #4 is probably the best
spot for novice users to learn exactly the distinction between
acceptable and unacceptable history rewrites (instead of having it
explained to them as a consequence of the step #5).

Thanks for your insight,

...Johan

[1]: For non-push-based workflows (such as send-email, sharing
bundles, pulls from local repo, etc.) we will need other methods for
determining which commits have been published. This have been
discussed earlier in the thread, but is outside the scope of what I
want to accomplish for now.

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-21  0:03                                 ` Johan Herland
@ 2012-02-21  7:44                                   ` Junio C Hamano
  2012-02-21 23:23                                     ` Johan Herland
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2012-02-21  7:44 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, jnareb, philipoakley

Johan Herland <johan@herland.net> writes:

> What  you are describing here may be a common workflow, but
> "rebase.bottomLimit" is still very specific to that kind of workflow.
> What I'm after is a much more workflow-agnostic concept of:
>
>   "If I have pushed something, I should probably not rebase it"

Your "this branch pushes directly to that remote branch, so I can check if
it will result in rewrite of published commit" is even *less* generic than
having a single bottomLimit in my illustration.

I may not push out my topic branches directly, only the aggregate of them
in 'next', but once 'next' is pushed out, they are not eligible for
rebasing.  A per-branch bottom, e.g. rebase.$branch.bottomLimit, might
make it more flexible to cover such a case, though.

On the other hand, without any such safety, a merge to 'next' would give
many conflicts and "shortlog master..next" will show many duplicates after
any topic that are already merged to it are accidentally rewritten, and it
is just the matter of using reflog on topic branches to recover from such
a mistake.

>> I wonder if it would be a more direct solution to the issue you are
>> raising to give them a good tool to help them to be more careful with less
>> effort on their part before they publish (not before they rebase).
> 
> ..., I'm not sure how we can help the user _not_ publish the
> branch until it's ready.

I think we are in agreement that we do not think of a good solution
offhand to the real cause of the issue, except by encouraging the use of
throw-away review branches, perhaps.

> I think the following decribes what often happens for many users:
>
>  1. User A pushes the branch to a public repo.
>  2. User B points out a simple mistake in the branch.

That's the CVS workflow, and it is not "a" public repo but "the" public
repo shared between A and B (and also with all the project participants).

>  3. User A makes a fix
>  4. User A squashes the fix into the (already-published) history.
>  5. User A attempts to push the "fixed" history (but is rejected by
> the public repo because of non-fast-forward).
> At this point, the damage is already done,...

Which is probably a sufficient safety which the user can learn from.  If
this happens too often, that probably means we are not helping them enough
to learn not to "commit --amend" or "rebase" if they are using Git as a
better CVS.

> You could say that User A should be more careful and push to a "less
> public" repo in step #1 (thus allowing the fix to be squashed before
> pushing to a "more public" repo in step #5),...

That is essentially a workflow that uses throw-away review branches in a
distributed environment, and at that point, we are not constrained by the
limitation of the CVS workflow. While still in early review cycles (which
corresponds to being in our 'pu'), "commit --amend" and "rebase" are fine
tool to be used.  And...

> but how "public" is
> "public" enough to have someone point out the bug, but still
> "unpublic" enough to allow rebase?

... I can imagine that currently that is determined purely by project
convention. Perhaps there needs a way to mark throw-away review branches
like 'pu' (or saying the same thing from the different perspective, to
mark cast-in-stone integration branches like 'next') so that tools can
mechanically decide what should and should not be rewritten.

To extend the idea of promoting throw-away review branches further,
perhaps it might help if there is an easy way to let the users publish
their "master" to a branch that is not the "master" of the central shared
repository even in the CVS workflow (e.g. by default a push from user A
always goes to refs/review/A/master), and to have an option to "git push"
that makes it go to the "master" when the user really means the branch is
ready (and it would move refs/review/A/master to attic to be later gc'ed).

> ... And I
> think that refusing rewrites of commits that are already present in
> the @{upstream} remote-tracking branch is good enough to help most
> users avoid steps #4 through #6 (in a push-based workflow[1]).

See above regarding branches that should not be rebased even if they are
not directly pushed out.

> In
> fact, from a pedagogical POV, I think step #4 is probably the best
> spot for novice users to learn exactly the distinction between
> acceptable and unacceptable history rewrites (instead of having it
> explained to them as a consequence of the step #5).

I doubt you have enough information at point #4, unless you restrict the
workflow you allow your novice users to use fairly severely, to give
appropriate advice.  While I agree with you that it would be the best if
we could do so at step #4 without stopping the user from doing what s/he
needs to do with false positive, I think it is not pedagogical POV but
dreaming if the world were ideal without knowing what it would take to
make it ideal.

At least I don't know offhand what kind of changes are needed to restrict
the user actions to an "approved" workflow so that step #4 can make a
useful decision (that is, no false positives and small enough false
negatives).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-21  7:44                                   ` Junio C Hamano
@ 2012-02-21 23:23                                     ` Johan Herland
  2012-02-21 23:43                                       ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Johan Herland @ 2012-02-21 23:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, jnareb, philipoakley

On Tue, Feb 21, 2012 at 08:44, Junio C Hamano <gitster@pobox.com> wrote:
> Johan Herland <johan@herland.net> writes:
>
>> What  you are describing here may be a common workflow, but
>> "rebase.bottomLimit" is still very specific to that kind of workflow.
>> What I'm after is a much more workflow-agnostic concept of:
>>
>>   "If I have pushed something, I should probably not rebase it"
>
> Your "this branch pushes directly to that remote branch, so I can check if
> it will result in rewrite of published commit" is even *less* generic than
> having a single bottomLimit in my illustration.
>
> I may not push out my topic branches directly, only the aggregate of them
> in 'next', but once 'next' is pushed out, they are not eligible for
> rebasing.  A per-branch bottom, e.g. rebase.$branch.bottomLimit, might
> make it more flexible to cover such a case, though.

But you would have to maintain rebase.$branch.bottomLimit separately
for each topic branch, updating it whenever the topic branch is merged
to 'next'. Is that practical?

> On the other hand, without any such safety, a merge to 'next' would give
> many conflicts and "shortlog master..next" will show many duplicates after
> any topic that are already merged to it are accidentally rewritten, and it
> is just the matter of using reflog on topic branches to recover from such
> a mistake.

All true, but my experience is that novice users will not enjoy this
experience, and rather become frustrated with Git for "putting them in
this situation" (i.e. giving them more rope than they can handle).
Maybe I'm doing something wrong...

>>> I wonder if it would be a more direct solution to the issue you are
>>> raising to give them a good tool to help them to be more careful with less
>>> effort on their part before they publish (not before they rebase).
>>
>> ..., I'm not sure how we can help the user _not_ publish the
>> branch until it's ready.
>
> I think we are in agreement that we do not think of a good solution
> offhand to the real cause of the issue, except by encouraging the use of
> throw-away review branches, perhaps.

Agreed. We should probably focus more on this when teaching novice users.

>> I think the following decribes what often happens for many users:
>>
>>  1. User A pushes the branch to a public repo.
>>  2. User B points out a simple mistake in the branch.
>
> That's the CVS workflow, and it is not "a" public repo but "the" public
> repo shared between A and B (and also with all the project participants).
>
>>  3. User A makes a fix
>>  4. User A squashes the fix into the (already-published) history.
>>  5. User A attempts to push the "fixed" history (but is rejected by
>> the public repo because of non-fast-forward).
>> At this point, the damage is already done,...
>
> Which is probably a sufficient safety which the user can learn from.  If
> this happens too often, that probably means we are not helping them enough
> to learn not to "commit --amend" or "rebase" if they are using Git as a
> better CVS.
>
>> You could say that User A should be more careful and push to a "less
>> public" repo in step #1 (thus allowing the fix to be squashed before
>> pushing to a "more public" repo in step #5),...
>
> That is essentially a workflow that uses throw-away review branches in a
> distributed environment, and at that point, we are not constrained by the
> limitation of the CVS workflow. While still in early review cycles (which
> corresponds to being in our 'pu'), "commit --amend" and "rebase" are fine
> tool to be used.  And...
>
>> but how "public" is
>> "public" enough to have someone point out the bug, but still
>> "unpublic" enough to allow rebase?
>
> ... I can imagine that currently that is determined purely by project
> convention. Perhaps there needs a way to mark throw-away review branches
> like 'pu' (or saying the same thing from the different perspective, to
> mark cast-in-stone integration branches like 'next') so that tools can
> mechanically decide what should and should not be rewritten.

Yes. Maybe it should be a tri-state thing:

  - "rebase-prone" will allow rewrites without asking.

  - unset will warn, but still allow rewrites. The warning should
explain the potential problem that may arise from rewriting, and
should also explain how to get rid of the warning by setting this
config to either of the above/below options. Ideally the warning
should also explain how the user may undo the rewrite. Obviously, most
of this warning should be controlled by a corresponding advice.*
config.

  - "cast-in-stone" will refuse rewrites.

I notice that this config is not really per-branch, but rather per
upstream branch. I wonder how to best encode that in the config...
Also, how a remote repo may best communicate which branches are
"rebase-prone"/"cast-in-stone".

> To extend the idea of promoting throw-away review branches further,
> perhaps it might help if there is an easy way to let the users publish
> their "master" to a branch that is not the "master" of the central shared
> repository even in the CVS workflow (e.g. by default a push from user A
> always goes to refs/review/A/master), and to have an option to "git push"
> that makes it go to the "master" when the user really means the branch is
> ready (and it would move refs/review/A/master to attic to be later gc'ed).

This goes against many workflows, and many users' expectations.
Although it may be a Good(tm) practice, I don't think it's universal
enough to be worth "breaking" 'push'. What about adding a new 'git
review' command which defaults to this behavior (but can be overridden
by Gerrit and other code-review systems to do what's appropriate in
their cases)?

>> ... And I
>> think that refusing rewrites of commits that are already present in
>> the @{upstream} remote-tracking branch is good enough to help most
>> users avoid steps #4 through #6 (in a push-based workflow[1]).
>
> See above regarding branches that should not be rebased even if they are
> not directly pushed out.

True. Is it universally acceptable to assume that a topic branch that
has been merged to a "cast-in-stone" branch, should not be
rebased/rewritten (unless forced by the user)?

There are several aspects to this: First, any commits on the topic
branch that have been made _since_ the merge should obviously be
rewritable (since they have yet to be merged). And commits made
_before_ the merge should probably not be rewritable. I.e. in the
following figure, the 'y' commits should be rewritable, but the 'x'
commits (preceding the merge 'M') should probably not.

 --o---o---o---o---o---o---M  <- master (cast-in-stone)
    \                     /
     x---x---x---x---x---x---y---y <- topicA

But what if we have two topic branches with shared history, like this:

 --o---o---o---o---o---o---M  <- master (cast-in-stone)
    \                     /
     x---x---x---x---x---x---y---y <- topicA
                              \
                               z---z <- topicB

We cannot say whether the 'x' commits "belong" to topicA or topicB.
Also, we cannot determine whether the merge M happened with topicA or
topicB (unless the default merge message has been preserved). Now,
given that we try to rewrite the entire topicB branch (including the
'x' commits): Should we refuse rewriting the 'x' commits because they
are reachable from master (regardless of whether topicA or topicB was
merged), or should we allow rewriting the 'x' commits on the basis
that topicA may be the branch that was merged in 'M', and since topicB
are "unaware" that the 'x' commits have been merged, it should thus be
allowed to rewrite them?

Now, what if you want to backport 'topicA' from 'master' to 'maint'.
You could do so like this:

  git checkout -b topicA_maint topicA
  git rebase --onto maint master

(note that topicA_maint is a special case of topicB in the above
graph) In this case, we would expect the 'x' commits to be rewritable,
or we could not perform the backport.

The conclusion seems to be that we _cannot_ refuse rewriting commits
merely on the basis that they have been merged to a cast-in-stone
branch.

>> In
>> fact, from a pedagogical POV, I think step #4 is probably the best
>> spot for novice users to learn exactly the distinction between
>> acceptable and unacceptable history rewrites (instead of having it
>> explained to them as a consequence of the step #5).
>
> I doubt you have enough information at point #4, unless you restrict the
> workflow you allow your novice users to use fairly severely, to give
> appropriate advice.  While I agree with you that it would be the best if
> we could do so at step #4 without stopping the user from doing what s/he
> needs to do with false positive, I think it is not pedagogical POV but
> dreaming if the world were ideal without knowing what it would take to
> make it ideal.
>
> At least I don't know offhand what kind of changes are needed to restrict
> the user actions to an "approved" workflow so that step #4 can make a
> useful decision (that is, no false positives and small enough false
> negatives).

Ok, trying to make as few assumptions about user workflows as
possible: Assuming there were a (machine-parsable) way to mark
branches as either "rebase-prone" or "cast-in-stone". Could we then
assume that rebasing commits that exist on a cast-in-stone branch's
remote-tracking @{upstream} should be refused by default? I'm trying
(without much success, it seems) to find _something_ that will help
the 1525 users that want Git to "warn before/when rewriting published
history"[1], but won't cause any false positives (and small enough
false negatives). If there really is no way to implement this, then we
shouldn't give users false hopes by putting it in the survey...

Have fun! :)

...Johan

[1]: Question 17 in https://www.survs.com/results/Q5CA9SKQ/P7DE07F0PL

--
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-21 23:23                                     ` Johan Herland
@ 2012-02-21 23:43                                       ` Junio C Hamano
  2012-02-21 23:59                                         ` Dave Zarzycki
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2012-02-21 23:43 UTC (permalink / raw)
  To: Johan Herland; +Cc: Junio C Hamano, git, jnareb, philipoakley

Johan Herland <johan@herland.net> writes:

> history"[1], but won't cause any false positives (and small enough
> false negatives). If there really is no way to implement this, then we
> shouldn't give users false hopes by putting it in the survey...

I think that question should be "warn before pushing out a commit that the
user may later regret to have pushed out" ;-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-21 23:43                                       ` Junio C Hamano
@ 2012-02-21 23:59                                         ` Dave Zarzycki
  2012-02-22  7:09                                           ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Dave Zarzycki @ 2012-02-21 23:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johan Herland, git, jnareb, philipoakley

On Feb 21, 2012, at 6:43 PM, Junio C Hamano <gitster@pobox.com> wrote:

> Johan Herland <johan@herland.net> writes:
> 
>> history"[1], but won't cause any false positives (and small enough
>> false negatives). If there really is no way to implement this, then we
>> shouldn't give users false hopes by putting it in the survey...
> 
> I think that question should be "warn before pushing out a commit that the
> user may later regret to have pushed out" ;-)

Why limit this proposal to just the commits that are reachable from upstream? What if somebody pulls from your repo?

In other words, wouldn't it be better to have a git track "unshared" commits and only let those be rewritten? The theory being that if the given commits haven't been pushed or pulled anywhere, then they are safe to rewrite.

davez

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-21 23:59                                         ` Dave Zarzycki
@ 2012-02-22  7:09                                           ` Jeff King
  2012-02-22  8:00                                             ` Dave Zarzycki
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2012-02-22  7:09 UTC (permalink / raw)
  To: Dave Zarzycki; +Cc: Junio C Hamano, Johan Herland, git, jnareb, philipoakley

On Tue, Feb 21, 2012 at 06:59:38PM -0500, Dave Zarzycki wrote:

> > I think that question should be "warn before pushing out a commit that the
> > user may later regret to have pushed out" ;-)
> 
> Why limit this proposal to just the commits that are reachable from
> upstream? What if somebody pulls from your repo?
> 
> In other words, wouldn't it be better to have a git track "unshared"
> commits and only let those be rewritten? The theory being that if the
> given commits haven't been pushed or pulled anywhere, then they are
> safe to rewrite.

You don't necessarily know who has read from you. Depending on your
setup, the user running git code may not have write access to the
repository (e.g., Alice runs "git pull ~bob/project.git"). Where would
Alice write the list of commits she pulled so that when Bob later runs
git, he knows that she has pulled them?

There is also the issue of "dumb" transports in which no git code is
running on the remote repo at all (e.g., Alice fetches from Bob via dumb
http; Bob's server doesn't even have git at all).

There may be clever or complex ideas to tackle those problems, but I
suspect that handling push would cover most practical cases (e.g., in
the dumb http case, Bob's commits probably ended up on the server via
push). So perhaps it is a good place to start.

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream
  2012-02-22  7:09                                           ` Jeff King
@ 2012-02-22  8:00                                             ` Dave Zarzycki
  0 siblings, 0 replies; 34+ messages in thread
From: Dave Zarzycki @ 2012-02-22  8:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Johan Herland, git, jnareb, philipoakley

On Feb 22, 2012, at 2:09 AM, Jeff King <peff@peff.net> wrote:

> On Tue, Feb 21, 2012 at 06:59:38PM -0500, Dave Zarzycki wrote:
> 
>>> I think that question should be "warn before pushing out a commit that the
>>> user may later regret to have pushed out" ;-)
>> 
>> Why limit this proposal to just the commits that are reachable from
>> upstream? What if somebody pulls from your repo?
>> 
>> In other words, wouldn't it be better to have a git track "unshared"
>> commits and only let those be rewritten? The theory being that if the
>> given commits haven't been pushed or pulled anywhere, then they are
>> safe to rewrite.
> 
> You don't necessarily know who has read from you. Depending on your
> setup, the user running git code may not have write access to the
> repository (e.g., Alice runs "git pull ~bob/project.git"). Where would
> Alice write the list of commits she pulled so that when Bob later runs
> git, he knows that she has pulled them?
> 
> There is also the issue of "dumb" transports in which no git code is
> running on the remote repo at all (e.g., Alice fetches from Bob via dumb
> http; Bob's server doesn't even have git at all).
> 
> There may be clever or complex ideas to tackle those problems, but I
> suspect that handling push would cover most practical cases (e.g., in
> the dumb http case, Bob's commits probably ended up on the server via
> push). So perhaps it is a good place to start.

Fair points. Honestly, I was thinking more about a developer pulling changes between locations within his or her control, say a laptop and a desktop, or simply between multiple clones on the same machine. In this scenario, it would be useful to warn the developer that: "[some of] the commits you are about to rewrite, while not in the upstream repository, have been replicated into other repositories within your control. These other repositories will not be rewritten. Are you sure you want to continue?"

davez

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-11 13:10                         ` Johan Herland
  2012-02-11 13:45                           ` Jakub Narebski
@ 2012-04-07 15:01                           ` Steven Michalske
  1 sibling, 0 replies; 34+ messages in thread
From: Steven Michalske @ 2012-04-07 15:01 UTC (permalink / raw)
  To: Johan Herland; +Cc: Jakub Narebski, Philip Oakley, git


On Feb 11, 2012, at 9:10 PM, Johan Herland wrote:

> On Fri, Feb 10, 2012 at 20:38, Jakub Narebski <jnareb@gmail.com> wrote:
>> On Tue, 7 Feb 2012, Johan Herland wrote:
>>> On Tue, Feb 7, 2012 at 15:31, Jakub Narebski <jnareb@gmail.com> wrote:
>>> I am unsure whether the 'secret'-ness of a commit should follow across
>>> the push, but if you do (assuming we store the 'secret' flag using
>>> git-notes) this is simply a matter of synchronizing the
>>> refs/notes/secret to the same remote.
>> 
>> I think it should, so that 'secret' commit would not escape by accident
>> via a group secret repository.
>> 
>> What makes it hard (I think) is that we would prefer to transfer
>> 'secret'-ness only for pushed commits.  That might be problem for notes
>> based implementation of 'secret' annotation and 'secret'-ness transfer...
>> though I guess knowing that there exist 'secret' commit with given SHA1
>> which we do not have and should not have is not much breach of
>> confidentiality.  Still...
> 
> If you don't want to transfer all of refs/notes/secret, you would
> probably have to extend the git protocol with a per-commit 'secret'
> flag (which would then be applied to the receiving repo's
> refs/notes/secret).

Implementing these as bi-directional transfer of flag attributes might be a good working concept.
This way we could implement the public flag and later add the secret flag, and later add the foo flag.

I say bidirectional because if Tom and mary are working on a group project.  Mary publishes a commit to the public repo.
When Tom pulls from Mary he should get the update for the flags that Mary published.  If Mary pushes to Tom's repo it should update Tom's flags as well.

> 
> Still, this is all specific to the 'secret' feature, which IMHO is
> much less important then the 'public' feature. Implementing the
> barebones 'public' feature (i.e. refuse rewrite of commits reachable
> from upstream) is much less work, and would be enough for 90% of git
> users, I believe.
> 
> 

There are two kinds of pushes.  Those to public facing repositories and those to a private working repository.  Like a build bot repo.  Pushes to that repo might not want to mark a commit as public.

Steve





> ...Johan
> 
> -- 
> Johan Herland, <johan@herland.net>
> www.herland.net
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-07 15:09                     ` Johan Herland
  2012-02-10 19:38                       ` Jakub Narebski
@ 2012-04-07 14:49                       ` Steven Michalske
  1 sibling, 0 replies; 34+ messages in thread
From: Steven Michalske @ 2012-04-07 14:49 UTC (permalink / raw)
  To: Johan Herland; +Cc: Jakub Narebski, Philip Oakley, git


On Feb 7, 2012, at 11:09 PM, Johan Herland wrote:

> (we are pretty much in violent agreement, so I will only comment where
> I find it necessary)
> 
> On Tue, Feb 7, 2012 at 15:31, Jakub Narebski <jnareb@gmail.com> wrote:
>> Also, when thinking about different scenarios of why one would like to
>> mark commit as 'secret', we might want to be able to mark commit as
>> secret / unpublishable with respect to _subset_ of remotes, so e.g.
>> I am prevented from accidentally publishing commits marked as 'secret'
>> to public repository, or to CI/QA repository, but I can push (perhaps
>> with warning) to group repository, together with 'secret'-ness state
>> of said commit...
>> 
>> ... though it wouldn't be as much 'secret' as 'confidential' ;-)
> 
> Another way to achieve this would be to have a config flag to control
> whether Git checks for the 'secret' flag before pushing. This config
> flag could be set at the system/user level (to enable/disable the
> feature as a whole), at the repo level (to enable/disable it in a
> given repo), at the remote level (to enable/disable it on a given
> repo), and finally at the branch level (to enable-disable it for a
> given branch (and its upstream)). Thus you could have a .git/config
> that looked like this:
> 
>  [core]
>  refusePushSecret = true
> 
>  [remote "foo"]
>  refusePushSecret = false
>  url = ...
>  fetch = ...
> 
>  [branch "baz"]
>  remote = foo
>  merge = refs/heads/baz
>  refusePushSecret = true
> 
> This config would:
> 
> - refuse to push 'secret' commits from branch 'baz'
> (branch.baz.refusePushSecret == true)
> 
> - but allow to push other branches with 'secret' commits to remote
> 'foo' (remote.foo.refusePushSecret == false)
> 
> - but refuse to push 'secret' commits to other remotes
> (core.refusePushSecret == true)
> 
> (The order of precedence would be: branch config > remote config >
> repo config > user config > system config > default when unset)
> 
> I am unsure whether the 'secret'-ness of a commit should follow across
> the push, but if you do (assuming we store the 'secret' flag using
> git-notes) this is simply a matter of synchronizing the
> refs/notes/secret to the same remote.
> 
I think this would allow teamwork on selected remotes!

Though after reading all the discussion on this thread I now feel that there are some different forms of secret.

Stuff you want to only exist on your repository, hacks, junk and stuff.
or
Material that should not go to your public facing repository.
or
A commit containing a file that is not for public consumption, but something you want tracked in the same repo.
This might be an incompatible concept; I cope with this by using a git submodule called internal in the code base, it works well.

> 
> Have fun! :)
> 
> ...Johan
> 
> -- 
> Johan Herland, <johan@herland.net>
> www.herland.net
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-07 14:31                   ` Jakub Narebski
  2012-02-07 15:09                     ` Johan Herland
@ 2012-02-07 17:27                     ` Ronan Keryell
  1 sibling, 0 replies; 34+ messages in thread
From: Ronan Keryell @ 2012-02-07 17:27 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

>>>>> On Tue, 7 Feb 2012 15:31:07 +0100, Jakub Narebski <jnareb@gmail.com> said:

    >> Agreed, but AFAICS (and modulo the addition of pre-rewrite and
    >> pre/post-push hooks mentioned earlier) all of the things
    >> discussed so far in this thread can be implemented as hooks.

    Jakub> That would be nice.

    Jakub> And the new hooks (pre-rewrite, pre/post-push) would be
    Jakub> useful anyway, I think.

Yes, to deal with file metadata in git for example. :-)
-- 
  Ronan KERYELL                      |\/  Phone:  +1 408 844 4720
  Wild Systems / HPC Project, Inc.   |/)  Cell:   +33 613 143 766
  5201 Great America Parkway #3241   K    Ronan.Keryell@hpc-project.com
  Santa Clara, CA 95054              |\   skype:keryell
  USA                                | \  http://hpc-project.com

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-04 19:45 [RFD] Rewriting safety - warn before/when rewriting published history Jakub Narebski
  2012-02-05 14:33 ` Ben Walton
       [not found] ` <CAFA910035B74E56A52A96097E76AC39@PhilipOakley>
@ 2012-02-06  0:57 ` Steven Michalske
  2012-02-06  6:53   ` Johan Herland
  2012-02-06 13:45   ` Jakub Narebski
  2 siblings, 2 replies; 34+ messages in thread
From: Steven Michalske @ 2012-02-06  0:57 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

See inlined responses below.

On Feb 4, 2012, at 11:45 AM, Jakub Narebski wrote:

> So people would like for git to warn them about rewriting history before 
> they attempt a push and it turns out to not fast-forward.
> 

I like this idea and I encounter this issue with my co-workers new to git.
It scares them thinking they broke the repository.

> In Mercurial 2.1 there are three available phases: 'public' for
> published commits, 'draft' for local un-published commits and
> 'secret' for local un-published commits which are not meant to
> be published.
> 
> The phase of a changeset is always equal to or higher than the phase
> of it's descendants, according to the following order:
> 
>      public < draft < secret

Let's not limit ourselves to just three levels.  They are a great start but I propose the following.

published - The commits that are on a public repository that if are rewritten will invoke uprisings.
	general rule here would be to revert or patch, no rewrites.
based - The commits that the core developers have work based upon. (not just the commits in their repo.)
	general rule is notify your fellow developers before a rewrite.
shared - The commits that are known to your fellow core developers.
	These commits are known, but have not had work based off of them.  Minimal risk to rewrite.
local - The commits that are local only, no one else has a copy.
	Commits your willing to share, but have not been yet shared, either from actions of you, or a fetch from others.
restricted or private - The commits that you do not want shared.
	Manually added, think of a branch tip marked as restricted automatically promotes commits to the branch as restricted.

Maybe make these like nice levels, but as two components, publicity 0-100 and rewritability 0-100
	Published is publicity 100 and rewritability 0
	Restricted is publicity 0 and rewritability 100
	Based publicity 75 and rewritability 25
	Shared publicity 50 and rewritability 50
	Local publicity 25 and rewritability 75
	Restricted publicity 0 and rewritability 100

Other option are flags stating if the commit is published, based, shared, or restricted.
You could have a published and based commit that is more opposed to rewrite than a public commit.

Call security on a published restricted commit ;-)

Commits are by default local.

Commits are published when they are pushed or fetched and merged to a publishing branch of a repository.
	On fetch/merge a post merge hook should send back a note to the remote repository that the commits were published.

Restricted commits/branches/tags should not be made public, error out and require clearing of the attribute or a --force-restricted option that automatically removes the restricted attribute.  They are at least promoted to shared, if not published.

Based is only used in situations where you have developers sharing amongst their repositories, and you want a rule that is less restrictive than no rewrites.

Shared is what we have now when a commit is in a remote repository without the no rewrite options. e.g. receive.denyNonFastForwards.

As it stands now we can infer local and shared,  we need metadata to know when a commit is made based, published, or restricted.


Using the nomenclature from Mercurial 
>      public < draft < secret

public -> publicity 100, rewritability 0
draft -> publicity ?, rewritability 50
secret -> publicity 0, rewritability 100

Steve

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06  0:57 ` Steven Michalske
@ 2012-02-06  6:53   ` Johan Herland
  2012-02-06 13:45   ` Jakub Narebski
  1 sibling, 0 replies; 34+ messages in thread
From: Johan Herland @ 2012-02-06  6:53 UTC (permalink / raw)
  To: Steven Michalske; +Cc: Jakub Narebski, git

On Mon, Feb 6, 2012 at 01:57, Steven Michalske <smichalske@gmail.com> wrote:
> On Feb 4, 2012, at 11:45 AM, Jakub Narebski wrote:
>> In Mercurial 2.1 there are three available phases: 'public' for
>> published commits, 'draft' for local un-published commits and
>> 'secret' for local un-published commits which are not meant to
>> be published.
>>
>> The phase of a changeset is always equal to or higher than the phase
>> of it's descendants, according to the following order:
>>
>>      public < draft < secret
>
> Let's not limit ourselves to just three levels.  They are a great start but I propose the following.
>
> published - The commits that are on a public repository that if are rewritten will invoke uprisings.
>        general rule here would be to revert or patch, no rewrites.
> based - The commits that the core developers have work based upon. (not just the commits in their repo.)
>        general rule is notify your fellow developers before a rewrite.
> shared - The commits that are known to your fellow core developers.
>        These commits are known, but have not had work based off of them.  Minimal risk to rewrite.
> local - The commits that are local only, no one else has a copy.
>        Commits your willing to share, but have not been yet shared, either from actions of you, or a fetch from others.
> restricted or private - The commits that you do not want shared.
>        Manually added, think of a branch tip marked as restricted automatically promotes commits to the branch as restricted.
>
> Maybe make these like nice levels, but as two components, publicity 0-100 and rewritability 0-100
>        Published is publicity 100 and rewritability 0
>        Restricted is publicity 0 and rewritability 100
>        Based publicity 75 and rewritability 25
>        Shared publicity 50 and rewritability 50
>        Local publicity 25 and rewritability 75
>        Restricted publicity 0 and rewritability 100
>
> [...]

With all due respect, I believe this is crazy. You're adding an entire
layer of complexity on top of commits that every user has to know
about, and has little or no value to most of them. IMHO, most users
only want Git to help them avoid doing something stupid (rewriting
'public' commits or publishing 'secret' commits), and to do so with
the minimal amount of manual user interaction. The above idea is more
suitable for armchair dictators that want to micromanage their commits
along two arbitrary axes of evil^H^H^H^Hpointlessness. On both axes,
you'll need threshold values where Git starts refusing to
publish/rewrite your commit. Hence, the only thing that matters is
whether the 'publicity'/'rewritability' value is above/below that
threshold, at which point you could save yourself a lot of trouble by
making them simple boolean flags instead.

Having said that, you can use 'git notes' along with existing and
proposed hooks (as described elsewhere in this thread) to implement
whatever crazy commit publishing/rewriting scheme you desire. To
misquote someone famous: I disapprove of what you want to do with Git,
but I will defend to the death your right to make Git do what you want
(in the privacy of your own repos). ;)

Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06  0:57 ` Steven Michalske
  2012-02-06  6:53   ` Johan Herland
@ 2012-02-06 13:45   ` Jakub Narebski
  2012-04-07 14:36     ` Steven Michalske
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-02-06 13:45 UTC (permalink / raw)
  To: Steven Michalske; +Cc: git

On Mon, 6 Feb 2012, Steven Michalske wrote:

> See inlined responses below.

Is this comment necessary at all?

> On Feb 4, 2012, at 11:45 AM, Jakub Narebski wrote:
> 
> > So people would like for git to warn them about rewriting history before 
> > they attempt a push and it turns out to not fast-forward.
> > 
> 
> I like this idea and I encounter this issue with my co-workers new to git.
> It scares them thinking they broke the repository.

It is true that while this feature would be useful also for "power
users", it would be most helpful for newbies (users new to git).

So I am afraid that implementing it with example hooks that must be
turned on explicitly might be not enough...

> > In Mercurial 2.1 there are three available phases: 'public' for
> > published commits, 'draft' for local un-published commits and
> > 'secret' for local un-published commits which are not meant to
> > be published.
> > 
> > The phase of a changeset is always equal to or higher than the phase
> > of it's descendants, according to the following order:
> > 
> >      public < draft < secret
> 
> Let's not limit ourselves to just three levels.  They are a great start
> but I propose the following. 

As we don't have any implementation, I'd rather we don't multiply entities.
I was even thinking about limiting to just 'public' and 'draft' "phases".

> published - The commits that are on a public repository that if are
>       rewritten will invoke uprisings. general rule here would be
>       to revert or patch, no rewrites. 
> based - The commits that the core developers have work based upon.
>       (not just the commits in their repo.) 
> 	general rule is notify your fellow developers before a rewrite.
> shared - The commits that are known to your fellow core developers.
> 	These commits are known, but have not had work based off of them.
> 	Minimal risk to rewrite. 

All these are very fairly nuanced, with minuscule differences between
them.  I'd rather not multiply entities, especially not introduce such
hard to guess what it about from their name.

In Mercurial phases share hierarchy of traits:
http://mercurial.selenic.com/wiki/Phases

           |        traits       |
           .......................  
           | immutable | shared  |
 ----------+-----------+---------+
 public    | x         | x       | ^
 draft     |           | x       | ^
 secret    |           |         | ^

The names of those traits probably should be changed in Git.

Those traits are boolean in Mercurial, but I think we can implement
what you would like to have to change them to tristate: 'deny' (unless
forced, i.e. the same as true), 'warn', 'ignore' (i.e. the same as false).

I think that it would be nice to be able to tune "severity" of trait
on per-remote and/or per-branch basis.  This way you would get warned
before rewriting commits that were pushed to your group repository,
and prevented from rewriting commits that are present in projects public
repository.

Nevertheless I think it is something better left for later, and added
only if it turns out to be really needed.

> local - The commits that are local only, no one else has a copy.
> 	Commits your willing to share, but have not been yet shared,
> 	either from actions of you, or a fetch from others. 

That's Mercurial's 'draft' phase.

> restricted or private - The commits that you do not want shared.
> 	Manually added, think of a branch tip marked as restricted
> 	automatically promotes commits to the branch as restricted. 

That's Mercurial 'secret' phase.

> Maybe make these like nice levels, but as two components,
> publicity 0-100 and rewritability 0-100 
> 	Published is publicity 100 and rewritability 0
> 	Restricted is publicity 0 and rewritability 100
> 	Based publicity 75 and rewritability 25
> 	Shared publicity 50 and rewritability 50
> 	Local publicity 25 and rewritability 75
> 	Restricted publicity 0 and rewritability 100

Continuous traits are IMHO a bad idea.  You would have to quantize them
and turn them on into specific behavior: ignore, warn, deny.

For example WTF does 25 "publicity" (bad name) or "rewritability" actually
means in term of git behavior, eh?

> Other option are flags stating if the commit is published, based,
> shared, or restricted.  You could have a published and based commit
> that is more opposed to rewrite than a public commit. 
> 
> Call security on a published restricted commit ;-)

Please note that while "phases" look like they are trait of individual
commits, they are in fact artifact of revision walking.  The idea is
that ancestors of 'private' commit can be 'private', 'draft' or 'public',
that ancestors of 'draft' commit are 'draft' or 'public', and that _all_
ancestors of 'public' commit are 'public'.

> Commits are by default local.

This 'by default' needs to be specified further, because for example
all commits in freshly cloned repository should be in 'public' phase
by default.  

Also, don't say 'commits are local', 'commits are published'; use "phases"
nomenclature (at least until we invent something so much better that it
is worth breaking consistency with Mercurial terminology).
> 
> Commits are published when they are pushed or fetched and merged to
> a publishing branch of a repository. 

BTW. I am not sure if pushing to remote repository updates (or can update)
any remote-tracking branches...

> 	On fetch/merge a post merge hook should send back a note to
> 	the remote repository that the commits were published. 

I think this is unnecessary in the "best practices" scenario, where each
user has separate private repository where he/she does his/her work, and
one's own public repository, where people fetch from.  He/she can push
to some shared repository, and that has to be supported too.

Though there is mothership/ sattellite situation, where you can pull and
push only from one direction.  There we might want for some way to notify
that some commits were fetched and should now be considered 'public'.
Though I am not sure if it is really necessary.

> Restricted commits/branches/tags should not be made public, error out and
> require clearing of the attribute or a --force-restricted option that
> automatically removes the restricted attribute.  They are at least promoted
> to shared, if not published.   

Or just skip them (silently or not) if we push using globbing refspec, and
glob matches some refs marked as 'private'.

> Based is only used in situations where you have developers sharing amongst
> their repositories, and you want a rule that is less restrictive than
> no rewrites.  

Multiplying entities.

> Shared is what we have now when a commit is in a remote repository without
> the no rewrite options. e.g. receive.denyNonFastForwards. 

Multiplying entities.

[...]
> > Using the nomenclature from Mercurial 
> >      public < draft < secret
> 
> public -> publicity 100, rewritability 0
> draft -> publicity ?, rewritability 50
> secret -> publicity 0, rewritability 100

That doesn't really help, at all.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFD] Rewriting safety - warn before/when rewriting published history
  2012-02-06 13:45   ` Jakub Narebski
@ 2012-04-07 14:36     ` Steven Michalske
  0 siblings, 0 replies; 34+ messages in thread
From: Steven Michalske @ 2012-04-07 14:36 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Saw that this was put into GSOC and reread the thread.


On Feb 6, 2012, at 9:45 PM, Jakub Narebski wrote:

> On Mon, 6 Feb 2012, Steven Michalske wrote:
> 
>> See inlined responses below.
> 
> Is this comment necessary at all?

Oddly enough yes.
> 
>> On Feb 4, 2012, at 11:45 AM, Jakub Narebski wrote:
>> 
>>> So people would like for git to warn them about rewriting history before 
>>> they attempt a push and it turns out to not fast-forward.
>>> 
>> 
>> I like this idea and I encounter this issue with my co-workers new to git.
>> It scares them thinking they broke the repository.
> 
> It is true that while this feature would be useful also for "power
> users", it would be most helpful for newbies (users new to git).
> 
> So I am afraid that implementing it with example hooks that must be
> turned on explicitly might be not enough...
> 
>>> In Mercurial 2.1 there are three available phases: 'public' for
>>> published commits, 'draft' for local un-published commits and
>>> 'secret' for local un-published commits which are not meant to
>>> be published.
>>> 
>>> The phase of a changeset is always equal to or higher than the phase
>>> of it's descendants, according to the following order:
>>> 
>>>     public < draft < secret
>> 
>> Let's not limit ourselves to just three levels.  They are a great start
>> but I propose the following. 
> 
> As we don't have any implementation, I'd rather we don't multiply entities.
> I was even thinking about limiting to just 'public' and 'draft' "phases".
> 
>> published - The commits that are on a public repository that if are
>>      rewritten will invoke uprisings. general rule here would be
>>      to revert or patch, no rewrites. 
>> based - The commits that the core developers have work based upon.
>>      (not just the commits in their repo.) 
>> 	general rule is notify your fellow developers before a rewrite.
>> shared - The commits that are known to your fellow core developers.
>> 	These commits are known, but have not had work based off of them.
>> 	Minimal risk to rewrite. 
> 
> All these are very fairly nuanced, with minuscule differences between
> them.  I'd rather not multiply entities, especially not introduce such
> hard to guess what it about from their name.
> 
Fair enough, the intent was to get folks thinking about where they could go.

As for published, it carries a higher meaning than public.  For example I have a public repo, and all the commits are public.
but if I make a notice that 1 of the branches "in_progress" is often rebased.... (I know of one stacked git user that does this)
 and that the master and dev branches are never going to be rewritten.  So master and dev are Published and Public, where "in_progress" is not.

So it would be nice that git would warn me if I based a branch off of a non published branch.

> In Mercurial phases share hierarchy of traits:
> http://mercurial.selenic.com/wiki/Phases
> 
>           |        traits       |
>           .......................  
>           | immutable | shared  |
> ----------+-----------+---------+
> public    | x         | x       | ^
> draft     |           | x       | ^
> secret    |           |         | ^

Not sure I agree with shared name,  is shared the same as sharable?

> 	
> The names of those traits probably should be changed in Git.
> 
> Those traits are boolean in Mercurial, but I think we can implement
> what you would like to have to change them to tristate: 'deny' (unless
> forced, i.e. the same as true), 'warn', 'ignore' (i.e. the same as false).
> 
> I think that it would be nice to be able to tune "severity" of trait
> on per-remote and/or per-branch basis.  This way you would get warned
> before rewriting commits that were pushed to your group repository,
> and prevented from rewriting commits that are present in projects public
> repository.
> 
> Nevertheless I think it is something better left for later, and added
> only if it turns out to be really needed.

There was a good comment on this in the thread.
> 
>> local - The commits that are local only, no one else has a copy.
>> 	Commits your willing to share, but have not been yet shared,
>> 	either from actions of you, or a fetch from others. 
> 
> That's Mercurial's 'draft' phase.
> 
>> restricted or private - The commits that you do not want shared.
>> 	Manually added, think of a branch tip marked as restricted
>> 	automatically promotes commits to the branch as restricted. 
> 
> That's Mercurial 'secret' phase.


> 
> 
>> Maybe make these like nice levels, but as two components,
>> publicity 0-100 and rewritability 0-100 
>> 	Published is publicity 100 and rewritability 0
>> 	Restricted is publicity 0 and rewritability 100
>> 	Based publicity 75 and rewritability 25
>> 	Shared publicity 50 and rewritability 50
>> 	Local publicity 25 and rewritability 75
>> 	Restricted publicity 0 and rewritability 100
> 
> Continuous traits are IMHO a bad idea.  You would have to quantize them
> and turn them on into specific behavior: ignore, warn, deny.

Continuous vs enumerated vs boolean,  I'd like to see enumerated honestly.
I proposed continuous to get the idea out there.

> 
> For example WTF does 25 "publicity" (bad name) or "rewritability" actually
> means in term of git behavior, eh?

Yea, a set of flags is a better idea.

> 
>> Other option are flags stating if the commit is published, based,
>> shared, or restricted.  You could have a published and based commit
>> that is more opposed to rewrite than a public commit. 
>> 
>> Call security on a published restricted commit ;-)
> 
> Please note that while "phases" look like they are trait of individual
> commits, they are in fact artifact of revision walking.  The idea is
> that ancestors of 'private' commit can be 'private', 'draft' or 'public',
> that ancestors of 'draft' commit are 'draft' or 'public', and that _all_
> ancestors of 'public' commit are 'public'.
> 
>> Commits are by default local.
> 
> This 'by default' needs to be specified further, because for example
> all commits in freshly cloned repository should be in 'public' phase
> by default.  

New commits
> 
> Also, don't say 'commits are local', 'commits are published'; use "phases"
> nomenclature (at least until we invent something so much better that it
> is worth breaking consistency with Mercurial terminology).

Sorry, I didn't like their terminology for my expanded concept, that allowed co-devlopers to work on secret work together.

Pretend we are co-workers on super cool new feature.
If you and I are working on a project at work, I can share my secrets with you, but I can't push it to the public servers yet.

Another concept that is similar to secret is internal code.
Working on code that has public source and source that is for inside your company.
	Though this is more an attribute of a file.
	Warn about commits with internal files being pushed to an external repository. 

>> 
>> Commits are published when they are pushed or fetched and merged to
>> a publishing branch of a repository. 
> 
> BTW. I am not sure if pushing to remote repository updates (or can update)
> any remote-tracking branches...
> 
>> 	On fetch/merge a post merge hook should send back a note to
>> 	the remote repository that the commits were published. 
> 
> I think this is unnecessary in the "best practices" scenario, where each
> user has separate private repository where he/she does his/her work, and
> one's own public repository, where people fetch from.  He/she can push
> to some shared repository, and that has to be supported too.
> 
> Though there is mothership/ sattellite situation, where you can pull and
> push only from one direction.  There we might want for some way to notify
> that some commits were fetched and should now be considered 'public'.
> Though I am not sure if it is really necessary.

I don't work in the environment of the best practices.  I most often work with a central authoritative repo.

> 
>> Restricted commits/branches/tags should not be made public, error out and
>> require clearing of the attribute or a --force-restricted option that
>> automatically removes the restricted attribute.  They are at least promoted
>> to shared, if not published.   
> 
> Or just skip them (silently or not) if we push using globbing refspec, and
> glob matches some refs marked as 'private'.

Agreed,  explicit pushes were what I was referring to.
> 
>> Based is only used in situations where you have developers sharing amongst
>> their repositories, and you want a rule that is less restrictive than
>> no rewrites.  
> 
> Multiplying entities.
> 
>> Shared is what we have now when a commit is in a remote repository without
>> the no rewrite options. e.g. receive.denyNonFastForwards. 
> 
> Multiplying entities.
> 
> [...]
>>> Using the nomenclature from Mercurial 
>>>     public < draft < secret
>> 
>> public -> publicity 100, rewritability 0
>> draft -> publicity ?, rewritability 50
>> secret -> publicity 0, rewritability 100
> 
> That doesn't really help, at all.
> 
> -- 
> Jakub Narebski
> Poland

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2012-04-07 15:01 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-04 19:45 [RFD] Rewriting safety - warn before/when rewriting published history Jakub Narebski
2012-02-05 14:33 ` Ben Walton
2012-02-05 15:05   ` Jakub Narebski
     [not found] ` <CAFA910035B74E56A52A96097E76AC39@PhilipOakley>
2012-02-05 16:15   ` Jakub Narebski
2012-02-05 17:29     ` Johan Herland
2012-02-05 20:46       ` Jakub Narebski
2012-02-05 22:49         ` Johan Herland
2012-02-06 14:44           ` Jakub Narebski
2012-02-06 15:59             ` Johan Herland
2012-02-06 17:14               ` Jakub Narebski
2012-02-06 20:16                 ` Johan Herland
2012-02-07 14:31                   ` Jakub Narebski
2012-02-07 15:09                     ` Johan Herland
2012-02-10 19:38                       ` Jakub Narebski
2012-02-10 20:19                         ` Philip Oakley
2012-02-11 13:10                         ` Johan Herland
2012-02-11 13:45                           ` Jakub Narebski
2012-02-20 21:07                             ` [RFC] pre-rebase: Refuse to rewrite commits that are reachable from upstream Johan Herland
2012-02-20 21:21                               ` Johan Herland
2012-02-20 22:43                               ` Junio C Hamano
2012-02-21  0:03                                 ` Johan Herland
2012-02-21  7:44                                   ` Junio C Hamano
2012-02-21 23:23                                     ` Johan Herland
2012-02-21 23:43                                       ` Junio C Hamano
2012-02-21 23:59                                         ` Dave Zarzycki
2012-02-22  7:09                                           ` Jeff King
2012-02-22  8:00                                             ` Dave Zarzycki
2012-04-07 15:01                           ` [RFD] Rewriting safety - warn before/when rewriting published history Steven Michalske
2012-04-07 14:49                       ` Steven Michalske
2012-02-07 17:27                     ` Ronan Keryell
2012-02-06  0:57 ` Steven Michalske
2012-02-06  6:53   ` Johan Herland
2012-02-06 13:45   ` Jakub Narebski
2012-04-07 14:36     ` Steven Michalske

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).