Re: Locking binary files

* Re: Locking binary files
       [not found] <94c1db200809222333q4953a6b9g8ce0c1cd4b8f5eb4@mail.gmail.com>
@ 2008-09-23  6:39 ` Mario Pareja
  2008-09-23  7:18   ` Andreas Ericsson
                     ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Mario Pareja @ 2008-09-23  6:39 UTC (permalink / raw)
  To: git

Hi,

For one and a half years, I have been keeping my eyes on the git
community in hopes of making the switch away from SVN.  One particular
issue holding me back is the inability to lock binary files.
Throughout the past year, I have yet to see developments on this
issue.  I understand that locking files goes against the fundamental
principles of distributed source control, but I think we need to come
up with some workarounds.  For Linux kernel development this may
not be an issue; however, for application development this is a major
issue. How else can one developer be sure that time spent editing a
binary file will not be wasted because another developer submitted a
change?

To achieve the effects of locking, a "central" repository must be
identified.  Regardless of the distributed nature of git, most
_companies_ will have a "central" repository for a software project.
We should be able to mark a file as requiring a lock from the
governing git repository at a specified address.  Is this made
difficult because git tracks file contents not files?

In any case, I think this is a crucial issue that needs to be
addressed if git is going to be adopted by companies with binary file
conflict potential. I don't see how a web development company can take
advantage of git to track source code and image file changes.  Any
advice would be great!

Regards,

Mario

* Re: Locking binary files
  2008-09-23  6:39 ` Locking binary files Mario Pareja
@ 2008-09-23  7:18   ` Andreas Ericsson
       [not found]     ` <94c1db200809230054t20e7e61dh5022966d4112eee6@mail.gmail.com>
  2008-09-23 11:16   ` Boaz Harrosh
  2008-09-23 13:44   ` Dmitry Potapov
  2 siblings, 1 reply; 19+ messages in thread
From: Andreas Ericsson @ 2008-09-23  7:18 UTC (permalink / raw)
  To: Mario Pareja; +Cc: git

Mario Pareja wrote:
> Hi,
> 
> For one and a half years, I have been keeping my eyes on the git
> community in hopes of making the switch away from SVN.  One particular
> issue holding me back is the inability to lock binary files.
> Throughout the past year, I have yet to see developments on this
> issue.  I understand that locking files goes against the fundamental
> principles of distributed source control, but I think we need to come
> up with some workarounds.  For Linux kernel development this may
> not be an issue; however, for application development this is a major
> issue. How else can one developer be sure that time spent editing a
> binary file will not be wasted because another developer submitted a
> change?
> 

Because they will cause merge conflicts when you try to bring the
histories together. Some binary formats can be edited by multiple
users at the same time, while others can't, so git will try to merge
those binary files for you. For images, that almost certainly won't
go so well so it will result in a conflict.

> To achieve the effects of locking, a "central" repository must be
> identified.

To achieve distributedness no central repository must exist. Locking
can be done by some other means.

>  Regardless of the distributed nature of git, most
> _companies_ will have a "central" repository for a software project.

Actually, all projects with some sort of userbase will probably have
some official "here's the published code suitable for production use"
repository. To say that it's the "central" one is a bit off though.
It's merely a public place that can be referred to for convenience.

> We should be able to mark a file as requiring a lock from the
> governing git repository at a specified address.  Is this made
> difficult because git tracks file contents not files?
> 
> In any case, I think this is a crucial issue that needs to be
> addressed if git is going to be adopted by companies with binary file
> conflict potential. I don't see how a web development company can take
> advantage of git to track source code and image file changes.  Any
> advice would be great!
> 

Try and find out.

mkdir foo && cd foo && git init
cp /random/binary/file.png image.png
git add image.png && git commit -m"first commit"
git checkout -b A
cp /other/random/binary/file.png image.png
git add image.png && git commit -m"conflicting commit"
git checkout -b B master
cp /third/random/binary/file.png image.png
git add image.png && git commit -m"non-conflicting commit"
git checkout master
cp /third/random/binary/file.png image.png
git add image.png && git commit -m"master says 'so be it'"

git merge B; # works, since the binary files are the same
git merge A; # produces a conflict message


In which way is that not exactly the right behaviour?
How would locking have helped?

If your colleagues are replacing files you committed so
that your code suddenly fails, you have a communication
(and QA) issue at work. Adding locking to git is not the
solution to that problem. Introducing a sort of builtin
notion of a central repository is, frankly, disgusting.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: Locking binary files
       [not found]     ` <94c1db200809230054t20e7e61dh5022966d4112eee6@mail.gmail.com>
@ 2008-09-23  8:31       ` Andreas Ericsson
  2008-09-23 13:56         ` Mario Pareja
  0 siblings, 1 reply; 19+ messages in thread
From: Andreas Ericsson @ 2008-09-23  8:31 UTC (permalink / raw)
  To: Mario Pareja, Git Mailing List

Mario, please don't reply in private. That way your mails won't
get indexed and you don't have a chance to get help from others
on the mailing list.

While we're at it; don't top-post. Most people who frequent email
lists with moderate to high traffic read hundreds of emails every
day, so a quick reminder of what the discussion was about is useful
when getting a reply. That reminder gets a lot trickier to get to
if you first have to scroll down and then back up. Besides that,
it feels totally backwards.

Mario Pareja wrote:
> Andreas,
> 
> Thanks for the quick reply.  You asked how I thought locking could
> have helped. I think locking helps notify a developer that a file is
> being modified _before_ the developer begins his/her own
> modifications. If I followed your example correctly, the conflict is
> identified after the work has been done - this is too late if you ask
> me.
> 

So it's a communication issue then. The way I understand locks in svn
and cvs is that they also only bother you when you want to check in the
file you've just recently modified, or if multiple people want to lock
the same file at the same time.

If that's the case, I see no problem whatsoever with teaching specific
git commands to interact with a locking server. git lock (and git unlock)
would have to be coupled with a git-lock-daemon with which everyone
communicates. It should probably have the ability to run a hook or
something (centrally) when a lock is obtained and released, so as to be
able to notify others that a lock is held.

I might write this for fun some day, but it's really not my itch to
scratch, and it would be a terrible mistake to add something like a
central repository to take care of it when a single rather stupid
daemon and an equally stupid program could do the same work but much
more efficiently.

Note that locking would be completely advisory though, and nothing
would prevent people from committing changes to a locked file. Then
again, insofar as I understand SVN/CVS locking, that's how those
work too, except that an SVN "checkin" would be the equivalent of
"git commit && git push" (the push part of the git sequence won't
work).
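
A minimal sketch of the advisory "git lock" / "git unlock" pair described
above. It assumes, purely for illustration, that the "lock server" is a
directory every developer can reach; the LOCK_DIR variable, the lock_path
and unlock_path names and the on-disk layout are all hypothetical, and a
real daemon would replace the mkdir call with a network request. mkdir is
used because creating a directory either succeeds or fails as one step.

```shell
#!/bin/sh
# Advisory path locks kept in a shared directory (hypothetical layout).
# LOCK_DIR stands in for the lock daemon; override it in the environment.
LOCK_DIR=${LOCK_DIR:-/srv/git-locks}

# Try to take the lock on repository path $1 for user $2.
# Returns 0 only if nobody holds it; mkdir is the atomic step.
lock_path() {
    lock="$LOCK_DIR/$1.lock"
    mkdir -p "$(dirname "$lock")" || return 1
    if mkdir "$lock" 2>/dev/null; then
        printf '%s\n' "$2" > "$lock/owner"
        return 0
    fi
    echo "locked by $(cat "$lock/owner" 2>/dev/null)" >&2
    return 1
}

# Release the lock on path $1, but only if user $2 owns it.
unlock_path() {
    lock="$LOCK_DIR/$1.lock"
    [ "$(cat "$lock/owner" 2>/dev/null)" = "$2" ] || return 1
    rm -f "$lock/owner" && rmdir "$lock"
}
```

Nothing here stops anyone from committing; the lock only records intent,
which is the "completely advisory" behaviour mentioned above.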

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: Locking binary files
  2008-09-23  8:31       ` Andreas Ericsson
@ 2008-09-23 13:56         ` Mario Pareja
  2008-09-23 14:28           ` Alex Riesen
                             ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Mario Pareja @ 2008-09-23 13:56 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Git Mailing List

> So it's a communication issue then.

Yes, but I think the communication of this information needs to happen
as part of a developer's normal work-flow rather than requiring them to
remember to check an external system.

> The way I understand locks in svn
> and cvs is that they also only bother you when you want to check in the
> file you've just recently modified, or if multiple people want to lock
> the same file at the same time.

The SVN client will make locked files read-only until a lock is
obtained for them.  This helps "remind" you that a lock should be
obtained before editing such a file. Requiring the developer to obtain
a lock ensures that nobody else is editing the file and prevents
wasted work.  Upon commit, the file is marked as unlocked and the
local file is once again read-only.

>
> Note that locking would be completely advisory though, and nothing
> would prevent people from committing changes to a locked file.

If git were to support locking then it could prevent people from
committing without first locking.  Even if it is not supported
directly by git - perhaps using a lock daemon - a wrapper would need
to be written around git commit/push to prevent developers from
committing/pushing changes that would cause binary merging conflicts.

> Then
> again, insofar as I understand SVN/CVS locking, that's how those
> work too, except that an SVN "checkin" would be the equivalent of
> "git commit && git push" (the push part of the git sequence won't
> work).
>

Generally in SVN you need to lock the file before being able to commit.

Really, I am just curious about how others deal with this issue.  Do
you simply start editing binary files and hope nobody else edits the
same file?  Do you send out an email telling people you are working on
such a file?

Mario

* Re: Locking binary files
  2008-09-23 13:56         ` Mario Pareja
@ 2008-09-23 14:28           ` Alex Riesen
  2008-09-23 17:32           ` Daniel Barkalow
  2008-09-23 20:46           ` Dmitry Potapov
  2 siblings, 0 replies; 19+ messages in thread
From: Alex Riesen @ 2008-09-23 14:28 UTC (permalink / raw)
  To: Mario Pareja; +Cc: Andreas Ericsson, Git Mailing List

2008/9/23 Mario Pareja <mpareja.dev@gmail.com>:
>> So it's a communication issue then.
>
> Yes, but I think the communication of this information needs to happen
> as part of a developer's normal work-flow rather than requiring them to
> remember to check an external system.

Look at pre-receive and update hooks. They can deny a push operation and
get enough information to notice a change to the path of your unlucky file.

And yes, *you* have to do that yourself.
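
A rough sketch of what such an update hook could look like. git runs
hooks/update once per pushed ref with the arguments refname, oldrev and
newrev, and a non-zero exit rejects that ref. Everything else is an
assumption made for illustration: the locks.txt file with one "owner:path"
entry per line, the LOCK_FILE and LOCK_USER variables, and the idea that
the transport exposes the pusher's name in USER.

```shell
#!/bin/sh
# Sketch of hooks/update: refuse pushes that change a path locked by someone else.
LOCK_FILE=${LOCK_FILE:-locks.txt}   # hypothetical lock list, "owner:path" per line

# Print the owner of the lock on path $2 in lock file $1, or nothing if unlocked.
lock_owner() {
    [ -f "$1" ] || return 0
    while IFS=: read -r owner path; do
        if [ "$path" = "$2" ]; then
            printf '%s\n' "$owner"
            return 0
        fi
    done < "$1"
    return 0
}

# True if $1 is the all-zero id git uses for a created or deleted ref.
is_null_rev() {
    case "$1" in *[!0]*) return 1 ;; *) return 0 ;; esac
}

# Check the update of ref $1 from $2 to $3; non-zero means "reject".
check_push() {
    pusher=${LOCK_USER:-$USER}      # assumption: the transport sets this
    # Ref creation and deletion are let through in this sketch.
    if is_null_rev "$2" || is_null_rev "$3"; then
        return 0
    fi
    git diff --name-only "$2" "$3" | (
        status=0
        while IFS= read -r changed; do
            owner=$(lock_owner "$LOCK_FILE" "$changed")
            if [ -n "$owner" ] && [ "$owner" != "$pusher" ]; then
                echo "rejected: $changed is locked by $owner" >&2
                status=1
            fi
        done
        exit $status
    )
}

# Only act when called the way git calls the update hook.
if [ $# -eq 3 ]; then
    check_push "$@"
    exit $?
fi
```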

* Re: Locking binary files
  2008-09-23 13:56         ` Mario Pareja
  2008-09-23 14:28           ` Alex Riesen
@ 2008-09-23 17:32           ` Daniel Barkalow
  2008-09-23 19:49             ` Junio C Hamano
  2008-09-23 20:46           ` Dmitry Potapov
  2 siblings, 1 reply; 19+ messages in thread
From: Daniel Barkalow @ 2008-09-23 17:32 UTC (permalink / raw)
  To: Mario Pareja; +Cc: Andreas Ericsson, Git Mailing List

On Tue, 23 Sep 2008, Mario Pareja wrote:

> > So it's a communication issue then.
> 
> Yes, but I think the communication of this information needs to happen
> as part of a developer's normal work-flow rather than requiring them to
> remember to check an external system.
> 
> > The way I understand locks in svn
> > and cvs is that they also only bother you when you want to check in the
> > file you've just recently modified, or if multiple people want to lock
> > the same file at the same time.
> 
> The SVN client will make locked files read-only until a lock is
> obtained for them.  This helps "remind" you that a lock should be
> obtained before editing such a file. Requiring the developer to obtain
> a lock ensures that nobody else is editing the file and prevents
> wasted work.  Upon commit, the file is marked as unlocked and the
> local file is once again read-only.

I think the right tool on the git side is actually a "smudge/clean" 
script. When you check something out, git converts it from the 
repository-stored form to a working tree form using a script (if there is 
one configured); this could check whether you've got the appropriate lock, 
and make the file unwritable if you don't. Then you have a script that 
gets or releases a lock and sets any write bits on files already checked 
out appropriately. There could also be locking-server magic to detect that 
you've pushed a change and release the lock, telling you so that it makes 
your file unwritable, but that's optional.

(Side note: consider version-specific logos; which lock you need depends 
on which version you're working on, and you may want to pick up locks for 
multiple versions and make changes to each logo, switching between the 
branches, and make sure you can get all the locks before you start 
working on any of the files, despite not having any individual file 
checked out continuously in the process)

> > Note that locking would be completely advisory though, and nothing
> > would prevent people from committing changes to a locked file.
> 
> If git were to support locking then it could prevent people from
> committing without first locking.  Even if it is not supported
> directly by git - perhaps using a lock daemon - a wrapper would need
> to be written around git commit/push to prevent developers from
> committing/pushing changes that would cause binary merging conflicts.

If you've gotten to the point of committing (let alone pushing), and you 
haven't got exclusive access, git should certainly not prevent you; the 
point of the locking is to prevent people from doing work that will be 
wasted, and the work is already done at this point. It's better then to 
actually try the binary merge, which comes down to apologizing profusely 
and then somebody opening the 3 versions (theirs, the other side's, and 
the common ancestor) in their graphics program and modifying the other 
side's to include their change. It wouldn't help anything to prevent 
people from being able to get all of these versions to each other, once 
they're made. It's also helpful to have people commit what they did before 
redoing it, so that they can use it for reference in the process and won't 
lose it.

(Actually, I bet it would be not-too-hard to set up gimp for three-way 
merge of images; open the result file with "theirs" as the contents, and 
open the common ancestor and "yours" as extra layers and set the ancestor 
to negative, and make the user clean up the mess)

On the other hand, the locking server should reject your push if somebody 
else has got the lock, so that the person who edited the file without 
having the lock is the one stuck redoing things.

In any case, the fundamental idea is: (a) you want some server to favor 
people who declare their intent to change something in advance, and give 
all the work of redoing stuff to people who didn't declare their intent in 
advance; and (b) you want to prompt people to declare their intent in case 
they forget.

(a) is an update (or pre-receive) hook that checks the diffstat against other people's 
locks. (b) is a smudge script that makes files you're supposed to lock and 
haven't a-w. Of course, git doesn't have the code for manipulating a 
per-user set of locks, but it shouldn't be too hard to find some project 
that just does that.
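
A small sketch of part (b), under stated assumptions. As far as I know a
smudge filter only receives the blob on stdin and writes the work tree
content to stdout, with git creating the file afterwards, so it cannot by
itself change the file's mode. This sketch therefore swaps in a
post-checkout hook to drop the write bits. The .lockable list, the
held-locks file, the HELD_LOCKS variable and the function name are all
hypothetical.

```shell
#!/bin/sh
# Sketch of .git/hooks/post-checkout: lockable files stay read-only
# until we hold their lock.

# $1: file listing lockable paths, one per line (hypothetical ".lockable")
# $2: file listing the paths whose locks we currently hold (hypothetical)
apply_lock_modes() {
    while IFS= read -r path; do
        [ -f "$path" ] || continue
        if [ -f "$2" ] && grep -Fxq -- "$path" "$2"; then
            chmod u+w "$path"     # we hold the lock: editable
        else
            chmod a-w "$path"     # no lock: a reminder to take one first
        fi
    done < "$1"
}

# git runs this hook from the top of the work tree in a non-bare repository.
if [ -f .lockable ]; then
    apply_lock_modes .lockable "${HELD_LOCKS:-.git/held-locks}"
fi
```

The same function could be called by whatever command takes or releases a
lock, so the write bits follow the lock state between checkouts as well.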

	-Daniel
*This .sig left intentionally blank*

* Re: Locking binary files
  2008-09-23 17:32           ` Daniel Barkalow
@ 2008-09-23 19:49             ` Junio C Hamano
  2008-09-23 21:13               ` Daniel Barkalow
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2008-09-23 19:49 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Mario Pareja, Andreas Ericsson, Git Mailing List

Daniel Barkalow <barkalow@iabervon.org> writes:

> I think the right tool on the git side is actually a "smudge/clean" 
> script. When you check something out, git converts it from the 
> repository-stored form to a working tree form using a script (if there is 
> one configured); this could check whether you've got the appropriate lock, 
> and make the file unwritable if you don't.

An obvious question is "how would such a script check the lock when you
are 30,000 ft above ground"; in other words, this "locking mechanism"
contradicts the very nature of distributed development.  The best
mechanism should always be on the human side.  An SCM augments
inter-developer communication, but it is not a _substitute_ for
communication.

But if you limit the use case to an always tightly connected environment
(aka "not distributed at all"), I agree the above would be a very
reasonable approach.

Such a setup would need a separate locking infrastructure and an end user
command that grabs the lock and when successful makes the file in the work
tree read/write.  The user butchers the contents after taking the lock,
saves, and then when running "git commit", probably the post-commit hook
would release any relevant locks.

All these can be left outside the scope of git, as they can be hooked into
git with the existing infrastructure. Once a BCP materializes it could be
added to contrib/ just like the "paranoid" update hook.

* Re: Locking binary files
  2008-09-23 19:49             ` Junio C Hamano
@ 2008-09-23 21:13               ` Daniel Barkalow
  2008-09-23 21:54                 ` Dmitry Potapov
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Barkalow @ 2008-09-23 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Mario Pareja, Andreas Ericsson, Git Mailing List

On Tue, 23 Sep 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > I think the right tool on the git side is actually a "smudge/clean" 
> > script. When you check something out, git converts it from the 
> > repository-stored form to a working tree form using a script (if there is 
> > one configured); this could check whether you've got the appropriate lock, 
> > and make the file unwritable if you don't.
> 
> An obvious question is "how would such a script check the lock when you
> are 30,000 ft above ground"; in other words, this "locking mechanism"
> contradicts the very nature of distributed development.  The best
> mechanism should always be on the human side.  An SCM augments
> inter-developer communication, but it is not a _substitute_ for
> communication.

If you're offline, you can't get new locks, nor release them. But it can 
make reasonable decisions if it remembers what locks you got before.

On the other hand, you can just make the file writable yourself while 
disconnected, and nothing bad happens to anybody else; if someone else 
locks the file and starts working, they'll block your eventual push until 
they push and you merge. And nothing too bad happens to you; you get stuck 
redoing the change later (as a merge), but (a) you would have had to do 
the work then anyway; (b) you knew you weren't protecting yourself; and 
(c) at least you got to practice on the plane.

The point of the locking is just that, if you get the lock for a 
particular file in a particular branch on a particular shared repository, 
you can be sure you won't have to merge that file in order to push there, 
and you can get this worked out in advance of having the push ready. A 
secondary concern is that you might want to stop yourself from working on 
certain things without this kind of reservation, but that's a local 
decision.

> But if you limit the use case to an always tightly connected environment
> (aka "not distributed at all"), I agree the above would be a very
> reasonable approach.
> 
> Such a setup would need a separate locking infrastructure and an end user
> command that grabs the lock and when successful makes the file in the work
> tree read/write.  The user butchers the contents after taking the lock,
> saves, and then when running "git commit", probably the post-commit hook
> would release any relevant locks.

The lock needs to last until you push to the repository the lock is for; 
otherwise you have the exclusive ability to make changes, but someone who 
grabs the lock right after you release it will still be working on the 
version without your change, which is what the lock is supposed to 
prevent.

> All these can be left outside the scope of git, as they can be hooked into
> git with the existing infrastructure. Once a BCP materializes it could be
> added to contrib/ just like the "paranoid" update hook.

It would be handy to link against some of git, since it will want to use 
git config files and remotes and refspecs to figure out what lock to ask 
for on the client side, and how to communicate with the target remote 
repository, and the process of getting a lock requires checking that 
you're up-to-date, and git's also got a bunch of useful code for atomic 
file updates and repository-scoped filename management. But adding this 
doesn't have to modify any existing behavior.

	-Daniel
*This .sig left intentionally blank*

* Re: Locking binary files
  2008-09-23 21:13               ` Daniel Barkalow
@ 2008-09-23 21:54                 ` Dmitry Potapov
  2008-09-23 22:29                   ` Daniel Barkalow
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Potapov @ 2008-09-23 21:54 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Junio C Hamano, Mario Pareja, Andreas Ericsson, Git Mailing List

On Tue, Sep 23, 2008 at 05:13:29PM -0400, Daniel Barkalow wrote:
> 
> The lock needs to last until you push to the repository the lock is for; 
> otherwise you have the exclusive ability to make changes, but someone who 
> grabs the lock right after you release it will still be working on the 
> version without your change, which is what the lock is supposed to 
> prevent.

It still will happen if developers work on topic branches, and it is not
a rare situation with Git. Thus locking some particular path is stupid.
What you may want instead is to mark the SHA-1 of this file as being edited
and later maybe as being replaced with another one. In this case, anyone
who has the access to the central information storage will get warning
about attempt to edit a file that is edited or already replaced with a
new version.

Dmitry

* Re: Locking binary files
  2008-09-23 21:54                 ` Dmitry Potapov
@ 2008-09-23 22:29                   ` Daniel Barkalow
  2008-09-23 23:21                     ` Dmitry Potapov
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Barkalow @ 2008-09-23 22:29 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Junio C Hamano, Mario Pareja, Andreas Ericsson, Git Mailing List

On Wed, 24 Sep 2008, Dmitry Potapov wrote:

> On Tue, Sep 23, 2008 at 05:13:29PM -0400, Daniel Barkalow wrote:
> > 
> > The lock needs to last until you push to the repository the lock is for; 
> > otherwise you have the exclusive ability to make changes, but someone who 
> > grabs the lock right after you release it will still be working on the 
> > version without your change, which is what the lock is supposed to 
> > prevent.
> 
> It still will happen if developers work on topic branches, and it is not
> a rare situation with Git. Thus locking some particular path is stupid.
> What you may want instead is to mark the SHA-1 of this file as being edited
> and later maybe as being replaced with another one. In this case, anyone
> who has the access to the central information storage will get warning
> about attempt to edit a file that is edited or already replaced with a
> new version.

No, your goal is to avoid having to do a merge in order to do a particular 
push. That push is the push to the shared location. It doesn't matter if 
you use topic branches, because your eventual goal is still to push to the 
shared location (or, possibly, to have the project maintainer push to the 
shared location with some sort of interesting delegation), so you lock the 
shared location, not your topic branch.

On the other hand, it's easily possible that other people (or you) want to 
fork the image, such that only some locations (either different paths in 
the project or the same path in different branches) get your change and 
other branches get different changes made at the same time. Of course, if 
you want to change multiple things, you need to get multiple locks.

	-Daniel
*This .sig left intentionally blank*

* Re: Locking binary files
  2008-09-23 22:29                   ` Daniel Barkalow
@ 2008-09-23 23:21                     ` Dmitry Potapov
  2008-09-24  4:15                       ` Daniel Barkalow
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Potapov @ 2008-09-23 23:21 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Junio C Hamano, Mario Pareja, Andreas Ericsson, Git Mailing List

On Tue, Sep 23, 2008 at 06:29:53PM -0400, Daniel Barkalow wrote:
> On Wed, 24 Sep 2008, Dmitry Potapov wrote:
> 
> > It still will happen if developers work on topic branches, and it is not
> > a rare situation with Git. Thus locking some particular path is stupid.
> > What you may want instead is to mark the SHA-1 of this file as being edited
> > and later maybe as being replaced with another one. In this case, anyone
> > who has the access to the central information storage will get warning
> > about attempt to edit a file that is edited or already replaced with a
> > new version.
> 
> No, your goal is to avoid having to do a merge in order to do a particular 
> push. That push is the push to the shared location. It doesn't matter if 
> you use topic branches, because your eventual goal is still to push to the 
> shared location (or, possibly, to have the project maintainer push to the 
> shared location with some sort of interesting delegation), so you lock the 
> shared location, not your topic branch.

What you are saying is that when I am locking some file on the current
branch, Git (or whatever script performs this locking) should figure
out what the original shared branch for it is and lock the file there.
When you have finished editing and pushed the changes, the lock should be
removed if changes are pushed to this shared branch, otherwise it should
be some token of delegation to the project maintainer who is going to
push (or probably first merge, because other files may need that) to
this branch.

Maybe, it can work, but it sounds too complex to me. I believe that my
idea using SHA-1 is better. After all, what is a file? It is its content.
At least, in Git, we always identify files by their content. Thus if you
lock some file, you put a lock on certain SHA-1. Now, regardless of
branches and paths, this lock can work provided that you have access to
some shared location. Of course, this lock is purely advisory, but it is
good, because you may want to ignore it in some cases. For instance, you
want to create a new branch based on the current shared location and
have no plan to ever merge it back. In this case, the lock on the shared
branch should not matter to you. This is true regardless of how you
implement locking, and in your scheme it will be another special case.
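
A minimal sketch of the content-keyed idea, assuming the shared location
is simply a directory that everyone can reach. LOCK_DIR, content_id and
mark_editing are hypothetical names; the only git command used is
"git hash-object", which prints the object id git would assign to a
file's content.

```shell
#!/bin/sh
# Advisory "being edited" marks keyed by blob id instead of by path
# (hypothetical layout: one directory per marked blob id under LOCK_DIR).
LOCK_DIR=${LOCK_DIR:-/srv/git-content-locks}

# The object id git gives to the content of file $1.
content_id() {
    git hash-object "$1"
}

# Record user $2 as editing the content of file $1.
# Returns 1 with a warning if that exact content is already marked.
mark_editing() {
    id=$(content_id "$1") || return 2
    mkdir -p "$LOCK_DIR" || return 2
    if mkdir "$LOCK_DIR/$id" 2>/dev/null; then
        printf '%s\n' "$2" > "$LOCK_DIR/$id/editor"
        return 0
    fi
    editor=$(cat "$LOCK_DIR/$id/editor" 2>/dev/null)
    echo "warning: this version is already being edited by $editor" >&2
    return 1
}
```

Because the key is the content, the same image checked out under another
path or on another branch triggers the same warning.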

Dmitry

* Re: Locking binary files
  2008-09-23 23:21                     ` Dmitry Potapov
@ 2008-09-24  4:15                       ` Daniel Barkalow
  2008-09-24 15:00                         ` Dmitry Potapov
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Barkalow @ 2008-09-24  4:15 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Junio C Hamano, Mario Pareja, Andreas Ericsson, Git Mailing List

On Wed, 24 Sep 2008, Dmitry Potapov wrote:

> On Tue, Sep 23, 2008 at 06:29:53PM -0400, Daniel Barkalow wrote:
> > On Wed, 24 Sep 2008, Dmitry Potapov wrote:
> > 
> > > It still will happen if developers work on topic branches, and it is not
> > > a rare situation with Git. Thus locking some particular path is stupid.
> > > What you may want instead is to mark the SHA-1 of this file as being edited
> > > and later maybe as being replaced with another one. In this case, anyone
> > > who has the access to the central information storage will get warning
> > > about attempt to edit a file that is edited or already replaced with a
> > > new version.
> > 
> > No, your goal is to avoid having to do a merge in order to do a particular 
> > push. That push is the push to the shared location. It doesn't matter if 
> > you use topic branches, because your eventual goal is still to push to the 
> > shared location (or, possibly, to have the project maintainer push to the 
> > shared location with some sort of interesting delegation), so you lock the 
> > shared location, not your topic branch.
> 
> What you are saying is that when I am locking some file on the current
> branch, Git (or whatever script that performs this locking) should figure
> out what is the original shared branch for it and lock the file there.

Or you should have to say. But "git lock <filename>" should probably 
put the lock on whatever branch "git push" would push to, and similarly 
for the other argument combinations that "git push" permits.

> When you have finished editing and pushed the changes, the lock should be
> removed if changes are pushed to this shared branch, otherwise it should
> be some token of delegation to the project maintainer who is going to
> push (or probably first merge, because other files may need that) to
> this branch.

Correct.

> Maybe, it can work, but it sounds too complex to me. I believe that my
> idea using SHA-1 is better. After all, what is a file? It is its content.
> At least, in Git, we always identify files by their content.

Not at all; there are plenty of cases where what matters is the path, and 
some things are relevant by virtue of the form of the filename which names 
that content.

> Thus if you lock some file, you put a lock on certain SHA-1. Now, 
> regardless of branches and paths, this lock can work provided that you 
> have access to some shared location. Of course, this lock is purely 
> advisory, but it is good, because you may want to ignore it in some 
> cases.

In my design, the lock (on the shared repository) is not advisory; if 
someone else has it, you can't push if the new commit doesn't match the 
old commit for that path. (Of course, the system might let you break the 
other person's lock.) I don't think locks are particularly useful if you 
don't get some particular guarantee out of having them (in my case, that 
somebody else will have to do any merge for the file if one is needed).

> For instance, you want to create a new branch based on the 
> current shared location and have no plan to ever merge it back. In this 
> case, the lock on the shared branch should not matter to you. This is 
> true regardless of how you implement locking, and in your scheme it will 
> be another special case.

If you have no intention to merge a local branch back to the remote branch 
it is based on, then you won't have the remote configured for this. If the 
locking and lock-checking code uses the push configuration to determine 
what locks make sense, it'll automatically be unrelated.
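
As a rough illustration of deriving the lock target from configuration, here
is a minimal sketch. It assumes the per-branch settings branch.<name>.remote
and branch.<name>.merge stand in for "whatever git push would push to"; the
thread does not say which keys a real "git lock" would consult, and the
helper names git_config and lock_target are hypothetical.

```python
import subprocess


def git_config(key):
    """Return the value of a git config key, or None if it is unset."""
    result = subprocess.run(
        ["git", "config", "--get", key], capture_output=True, text=True
    )
    return result.stdout.strip() if result.returncode == 0 else None


def lock_target(branch, get=git_config):
    """Return (remote, remote_ref) where a lock for `branch` would live.

    Returns None when the branch has no remote configured, in which case
    locks on the shared repository are simply unrelated to this branch.
    """
    remote = get(f"branch.{branch}.remote")
    merge = get(f"branch.{branch}.merge")
    if not remote or not merge:
        return None
    return remote, merge
```

A branch that was never set up to go back to the shared repository yields
None, so no lock is taken or checked for it.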

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-24  4:15                       ` Daniel Barkalow
@ 2008-09-24 15:00                         ` Dmitry Potapov
  0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-09-24 15:00 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Junio C Hamano, Mario Pareja, Andreas Ericsson, Git Mailing List

On Wed, Sep 24, 2008 at 12:15:39AM -0400, Daniel Barkalow wrote:
> On Wed, 24 Sep 2008, Dmitry Potapov wrote:
> 
> > 
> > What are you saying is that when I am locking some file on the current
> > branch, Git (or whatever script that performs this locking) should figure
> > out what is the original shared branch for it and lock the file there.
> 
> Or you should have to say. But "git lock <filename>" should probably 
> put the lock on whatever branch "git push" would push to, and similarly 
> for the other argument combinations that "git push" permits.

It seems very fragile to me to rely on the push configuration in deciding
what can be locked and what cannot. Besides, this configuration can change
over time. So what is going to happen to the locks then? Another problem:
what if I don't push at all but usually send pull requests?

The fact is if you cannot get your locking working in _one_ repository
then any hope that it will work when you have more than one is nothing
but a pipe dream.

> 
> > Maybe, it can work, but it sounds too complex to me. I believe that my
> > idea using SHA-1 is better. After all, what is file? It is its content.
> > At least, in Git, we always identify files by their content.
> 
> Not at all; there are plenty of cases where what matters is the path, and 
> some things are relevant by virtue of the form of the filename which names 
> that content.

Whether it matters or not depends on the particular workflow and what the
developer wants to achieve. Such decisions should be made by a human
being; otherwise you are prone to doing the wrong thing too often.

> 
> > Thus if you lock some file, you put a lock on certain SHA-1. Now, 
> > regardless of branches and paths, this lock can work provided that you 
> > have access to some shared location. Of course, this lock is purely 
> > advisory, but it is good, because you may want to ignore it in some 
> > case.
> 
> In my design, the lock (on the shared repository) is not advisory; if 
> someone else has it, you can't push if the new commit doesn't match the 
> old commit for that path.

Hey, if someone wants to push this file, it is already too late,
because you _already_ have the situation where two people have edited
exactly the same binary file. Isn't that the situation the lock was
intended to prevent?

So, the goal should be to warn someone who is about to edit a file locked
by someone else. You cannot prevent him/her from doing so; you can only
warn about it.

As to pushing, there can be different policies. IMHO, the update hook is
the best place to express which pushes you want to allow and which not, but
some workflows may not use push at all, yet the ability to lock (perhaps
'synchronize' would be a better word here) may still be needed.
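
To make the update-hook policy concrete, here is a minimal sketch of the
check such a hook could run. It is only an illustration: the lock table (a
dict from path to owner) and the way the pusher is identified are
assumptions, and the names changed_paths and lock_violations are
hypothetical. It does not handle ref creation or deletion, where the old or
new revision is all zeros.

```python
import os
import subprocess


def changed_paths(old, new):
    """List the paths that differ between two commits."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "-z", old, new],
        check=True, capture_output=True,
    ).stdout
    # -z separates paths with NUL bytes, so odd filenames survive intact.
    return [os.fsdecode(p) for p in out.split(b"\0") if p]


def lock_violations(paths, locks, pusher):
    """Return the changed paths that are locked by someone other than pusher."""
    return sorted(p for p in paths if locks.get(p, pusher) != pusher)
```

An update hook receives the ref name, the old revision, and the new revision
as arguments; it could call changed_paths(old, new), pass the result to
lock_violations, and exit non-zero when the list is not empty.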

Dmitry

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-23 13:56         ` Mario Pareja
  2008-09-23 14:28           ` Alex Riesen
  2008-09-23 17:32           ` Daniel Barkalow
@ 2008-09-23 20:46           ` Dmitry Potapov
  2 siblings, 0 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-09-23 20:46 UTC (permalink / raw)
  To: Mario Pareja; +Cc: Andreas Ericsson, Git Mailing List

On Tue, Sep 23, 2008 at 09:56:57AM -0400, Mario Pareja wrote:
> 
> The SVN client will make locked files read-only until a lock is
> obtained for them.  This helps "remind" you that a lock should be
> obtained before editing such a file. Requiring the developer to obtain
> a lock ensures that nobody else is editing the file and prevents
> wasted work.  Upon commit, the file is marked as unlocked and the
> local file is once again read-only.

The approach that SVN takes is not only impossible in a distributed
environment, it does not work even in a _single_ repository where you
have branching and merging. If you have a topic branch, then your lock
will have zero effect on other developers, and their locks will have
zero effect on you. Obviously, you are going to have a binary merge
conflict at the end. But it is even worse than that. Suppose somebody
locked a file on the master branch and you cloned from it. Now this
somebody unlocks the file, but the file on your branch remains locked by
this person, who may not even be aware of your branch. That is insane!

Dmitry

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-23  6:39 ` Locking binary files Mario Pareja
  2008-09-23  7:18   ` Andreas Ericsson
@ 2008-09-23 11:16   ` Boaz Harrosh
  2008-09-23 11:20     ` Boaz Harrosh
  2008-09-23 14:14     ` Mario Pareja
  2008-09-23 13:44   ` Dmitry Potapov
  2 siblings, 2 replies; 19+ messages in thread
From: Boaz Harrosh @ 2008-09-23 11:16 UTC (permalink / raw)
  To: Mario Pareja; +Cc: git

Mario Pareja wrote:
> Hi,
> 
> For one and a half years, I have been keeping my eyes on the git
> community in hopes of making the switch away from SVN.  One particular
> issue holding me back is the inability to lock binary files.
> Throughout the past year, I have yet to see developments on this
> issue.  I understand that locking files goes against the fundamental
> principles of distributed source control, but I think we need to come
> up with some workarounds.  For Linux kernel development this is may
> not be an issue; however, for application development this is a major
> issue. How else can one developer be sure that time spent editing a
> binary file will not be wasted because another developer submitted a
> change?
> 
> To achieve the effects of locking, a "central" repository must be
> identified.  Regardless of the distributed nature of git, most
> _companies_ will have a "central" repository for a software project.
> We should be able to mark a file as requiring a lock from the
> governing git repository at a specified address.  Is this made
> difficult because git tracks file contents not files?
> 
> In any case, I think this is a crucial issue that needs to be
> addressed if git is going to be adopted by companies with binary file
> conflict potential. I don't see how a web development company can take
> advantage of git to track source code and image file changes.  Any
> advice would be great!
> 
> Regards,
> 
> Mario
> --

It should be easy for a company to set a policy where a couple of scripts
must be run for particular types of files. Given that, the implementation
of such scripts is easy:

For every foo.bin there is possibly a foo.bin.lock file.

Lock-script looks for absence of the lock-file at upstream, then git-adds
the lock-file (with some info that tells users things like who has the
file) and pushes. If git-push fails, since I'm adding a file and someone
already added it while I was pushing, then the lock is not granted.

Unlock-script will git-rm the lock-file and push.

In both scripts the mode bits of the original file can be toggled for
read-only/write signaling to the user. (At upstream the file is always
read-only.)
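
The mode-bit toggling could look roughly like this. It is a sketch only; the
helper names set_read_only and set_writable are hypothetical, and it assumes
a POSIX-style permission model.

```python
import os
import stat

# All three write-permission bits: owner, group, and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def set_read_only(path):
    """Clear every write bit, signaling that the lock is not held."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def set_writable(path):
    """Restore the owner's write bit once the lock has been granted."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IWUSR)
```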

This can also work in a distributed system with more than one tier of
servers. (Locks are pushed to the most upstream server.)

Combine that with git's mail notifications for commits and you have a
system far more robust than svn will ever want to be.
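
The lock and unlock scripts described above can be sketched as follows. This
is a minimal illustration under stated assumptions, not a finished tool: the
remote name "origin", the branch "master", the owner argument, and the
function names are all hypothetical, and it assumes the index holds no other
staged changes and the local branch has no unpushed commits.

```python
import subprocess
from pathlib import Path


def git(*args, cwd="."):
    """Run a git command and return its exit status (0 means success)."""
    return subprocess.run(["git", *args], cwd=cwd).returncode


def acquire_lock(path, owner, remote="origin", branch="master", cwd=".", run=git):
    """Try to take the lock on `path`; return True if it was granted."""
    lock = path + ".lock"
    # Refresh our view of upstream, then look for an existing lock-file there.
    if run("fetch", remote, cwd=cwd) != 0:
        return False
    if run("cat-file", "-e", f"{remote}/{branch}:{lock}", cwd=cwd) == 0:
        return False  # the lock-file exists upstream: someone holds the lock
    # Record who holds the lock, then try to publish it.
    (Path(cwd) / lock).write_text(owner + "\n")
    run("add", lock, cwd=cwd)
    run("commit", "--quiet", "-m", f"Lock {path}", cwd=cwd)
    if run("push", remote, f"HEAD:{branch}", cwd=cwd) != 0:
        # Somebody pushed first: undo our local lock commit and lock-file.
        run("reset", "--quiet", "HEAD~1", cwd=cwd)
        (Path(cwd) / lock).unlink()
        return False
    return True


def release_lock(path, remote="origin", branch="master", cwd=".", run=git):
    """Remove the lock-file and push; return True if the push succeeded."""
    run("rm", "--quiet", path + ".lock", cwd=cwd)
    run("commit", "--quiet", "-m", f"Unlock {path}", cwd=cwd)
    return run("push", remote, f"HEAD:{branch}", cwd=cwd) == 0
```

The rejected push is what settles the race: whoever pushes the lock-file
second is refused because the push is not a fast-forward.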

My $0.017
Boaz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-23 11:16   ` Boaz Harrosh
@ 2008-09-23 11:20     ` Boaz Harrosh
  2008-09-23 14:14     ` Mario Pareja
  1 sibling, 0 replies; 19+ messages in thread
From: Boaz Harrosh @ 2008-09-23 11:20 UTC (permalink / raw)
  To: Mario Pareja; +Cc: git

Boaz Harrosh wrote:
> Mario Pareja wrote:
>> Hi,
>>
>> For one and a half years, I have been keeping my eyes on the git
>> community in hopes of making the switch away from SVN.  One particular
>> issue holding me back is the inability to lock binary files.
>> Throughout the past year, I have yet to see developments on this
>> issue.  I understand that locking files goes against the fundamental
>> principles of distributed source control, but I think we need to come
>> up with some workarounds.  For Linux kernel development this is may
>> not be an issue; however, for application development this is a major
>> issue. How else can one developer be sure that time spent editing a
>> binary file will not be wasted because another developer submitted a
>> change?
>>
>> To achieve the effects of locking, a "central" repository must be
>> identified.  Regardless of the distributed nature of git, most
>> _companies_ will have a "central" repository for a software project.
>> We should be able to mark a file as requiring a lock from the
>> governing git repository at a specified address.  Is this made
>> difficult because git tracks file contents not files?
>>
>> In any case, I think this is a crucial issue that needs to be
>> addressed if git is going to be adopted by companies with binary file
>> conflict potential. I don't see how a web development company can take
>> advantage of git to track source code and image file changes.  Any
>> advice would be great!
>>
>> Regards,
>>
>> Mario
>> --
> 
> It should be easy for a company to set a policy where a couple of scripts
> must be run for particular type of files. Given that, the implementation
> of such scripts is easy:
> 
> For every foo.bin there is possibly a foo.bin.lock file.
> 
> Lock-script looks for absence of the lock-file at upstream then git-add
> the file (With some info that tells users things like who has the file).
> If git-push fails, since I'm adding a file and someone already added
> it while I was pushing, then the lock is not granted.
> 
> Unlock-script will git-rm the lock-file and push.
> 
> In both scripts mod-bits of original file can be toggled for
> read-only/write signaling to the user. (At upstream the file is always
> read-only)
> 
> This can also work in a distributed system with more than one tier of
> servers. (Locks pushed to the most upstream server)
> 
> Combine that with git's mail notifications for commits and you have a
> system far more robust than svn will ever want to be
> 
> My $0.017
> Boaz
> 

OK, combine that with the technique presented in this ml thread:
  "Management of opendocument (openoffice.org) files in git"
and make all this automatic for particular types of files.

Boaz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-23 11:16   ` Boaz Harrosh
  2008-09-23 11:20     ` Boaz Harrosh
@ 2008-09-23 14:14     ` Mario Pareja
  2008-09-23 14:35       ` Boaz Harrosh
  1 sibling, 1 reply; 19+ messages in thread
From: Mario Pareja @ 2008-09-23 14:14 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: git

> It should be easy for a company to set a policy where a couple of scripts
> must be run for particular type of files. Given that, the implementation
> of such scripts is easy:
>
> For every foo.bin there is possibly a foo.bin.lock file.
>
> Lock-script looks for absence of the lock-file at upstream then git-add
> the file (With some info that tells users things like who has the file).
> If git-push fails, since I'm adding a file and someone already added
> it while I was pushing, then the lock is not granted.
>
> Unlock-script will git-rm the lock-file and push.
>
> In both scripts mod-bits of original file can be toggled for
> read-only/write signaling to the user. (At upstream the file is always
> read-only)
>
> This can also work in a distributed system with more than one tier of
> servers. (Locks pushed to the most upstream server)
>
> Combine that with git's mail notifications for commits and you have a
> system far more robust than svn will ever want to be
>
> My $0.017
> Boaz
>

This is a reasonable approach to obtaining the desired functionality.
Unfortunately, I have not seen any third-party packages implementing
such a feature.  It seems to me the problem is general enough to be
solved once rather than requiring organizations wishing to use git to
implement an in-house locking system. It simply creates more friction.
Perhaps, when I have the time, I will come up with something others
can use.  For now, unfortunately, it seems I am out of luck?

Mario

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-23 14:14     ` Mario Pareja
@ 2008-09-23 14:35       ` Boaz Harrosh
  0 siblings, 0 replies; 19+ messages in thread
From: Boaz Harrosh @ 2008-09-23 14:35 UTC (permalink / raw)
  To: Mario Pareja; +Cc: git

Mario Pareja wrote:
>> It should be easy for a company to set a policy where a couple of scripts
>> must be run for particular type of files. Given that, the implementation
>> of such scripts is easy:
>>
>> For every foo.bin there is possibly a foo.bin.lock file.
>>
>> Lock-script looks for absence of the lock-file at upstream then git-add
>> the file (With some info that tells users things like who has the file).
>> If git-push fails, since I'm adding a file and someone already added
>> it while I was pushing, then the lock is not granted.
>>
>> Unlock-script will git-rm the lock-file and push.
>>
>> In both scripts mod-bits of original file can be toggled for
>> read-only/write signaling to the user. (At upstream the file is always
>> read-only)
>>
>> This can also work in a distributed system with more than one tier of
>> servers. (Locks pushed to the most upstream server)
>>
>> Combine that with git's mail notifications for commits and you have a
>> system far more robust than svn will ever want to be
>>
>> My $0.017
>> Boaz
>>
> 
> This is a reasonable approach to obtaining the desired functionality.
> Unfortunately, I have not seen any third-party packages implementing
> such a feature.  It seems to me the problem is general enough to be
> solved once rather than requiring organizations wishing to use git to
> implement an in-house locking system. It simply creates more friction.
> Perhaps, when I have the time, I will come up with something others
> can use.  For now, unfortunately, it seems I am out of luck?
> 
> Mario
> --

That's open source, my friend: whoever needs it first implements it. More
and more development platforms use XML files where they used a binary file
format before, just for these cases. Git is mostly used with open-source
and/or very new systems that don't have binary file formats. OK, graphics is
another thing, I guess.

So you are welcome to it. "git-lock" is available.

Boaz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Locking binary files
  2008-09-23  6:39 ` Locking binary files Mario Pareja
  2008-09-23  7:18   ` Andreas Ericsson
  2008-09-23 11:16   ` Boaz Harrosh
@ 2008-09-23 13:44   ` Dmitry Potapov
  2 siblings, 0 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-09-23 13:44 UTC (permalink / raw)
  To: Mario Pareja; +Cc: git

On Tue, Sep 23, 2008 at 02:39:41AM -0400, Mario Pareja wrote:
> 
> How else can one developer be sure that time spent editing a
> binary file will not be wasted because another developer submitted a
> change?

That sounds to me more like a communication problem than anything
related to Git itself.

> 
> To achieve the effects of locking, a "central" repository must be
> identified.  Regardless of the distributed nature of git, most
> _companies_ will have a "central" repository for a software project.
> We should be able to mark a file as requiring a lock from the
> governing git repository at a specified address.  Is this made
> difficult because git tracks file contents not files?

The problem exists regardless of the distributed nature of git. Let's
consider a single repository with only two branches: A and B. Now, one
developer has decided to edit some binary file called pretty.img on A.
Should this file be locked only on branch A or on both branches? The
answer is that if A is going to be merged to B, then the file should be
locked on B too and remain locked until A is merged to B. In fact, it may
be *absolutely* pointless to lock the file on the developer's topic
branch, because another developer can edit it on another topic branch
without noticing that this lock exists at all. So, it may be enough to
lock it only on B, but this is impossible for Git to know, because Git
does not understand _your_ particular workflow, and without that any
locking scheme is rather meaningless.

Perhaps a more general solution can be based on the content rather than
on the name, i.e. in some shared directory on the server I create
a file with a name based on the SHA-1 of the binary file, where I put a
comment explaining why I locked it. Obviously, this lock is purely advisory,
but that is good: in some situations you really may want to edit two
files with the same SHA-1 on different branches that never get merged.
Moreover, this lock is never deleted. So, instead of having a separate
file per lock, it could make sense to organize the locks in some more
compact storage, which may look like a history of editing binary files...
But it is just an idea of how I would do that.
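
A minimal sketch of that idea, assuming the lock name is the SHA-1 that Git
assigns to the file's content as a blob and that the shared directory is an
ordinary filesystem path. The function names blob_sha1 and advisory_lock are
hypothetical.

```python
import hashlib
import os


def blob_sha1(data: bytes) -> str:
    """Return the object name Git gives a blob with this content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def advisory_lock(file_path, lock_dir, comment):
    """Record an advisory lock; return False if this content is already locked."""
    with open(file_path, "rb") as f:
        name = blob_sha1(f.read())
    try:
        # O_EXCL makes creation fail if a lock entry with this name exists.
        fd = os.open(
            os.path.join(lock_dir, name),
            os.O_WRONLY | os.O_CREAT | os.O_EXCL,
            0o644,
        )
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as out:
        out.write(comment + "\n")
    return True
```

Because the lock is keyed on content, it applies on any branch and under any
path where that exact version of the file appears, and nothing stops a
developer from ignoring it.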

Dmitry

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2008-09-24 15:02 UTC | newest]

Thread overview: 19+ messages
     [not found] <94c1db200809222333q4953a6b9g8ce0c1cd4b8f5eb4@mail.gmail.com>
2008-09-23  6:39 ` Locking binary files Mario Pareja
2008-09-23  7:18   ` Andreas Ericsson
     [not found]     ` <94c1db200809230054t20e7e61dh5022966d4112eee6@mail.gmail.com>
2008-09-23  8:31       ` Andreas Ericsson
2008-09-23 13:56         ` Mario Pareja
2008-09-23 14:28           ` Alex Riesen
2008-09-23 17:32           ` Daniel Barkalow
2008-09-23 19:49             ` Junio C Hamano
2008-09-23 21:13               ` Daniel Barkalow
2008-09-23 21:54                 ` Dmitry Potapov
2008-09-23 22:29                   ` Daniel Barkalow
2008-09-23 23:21                     ` Dmitry Potapov
2008-09-24  4:15                       ` Daniel Barkalow
2008-09-24 15:00                         ` Dmitry Potapov
2008-09-23 20:46           ` Dmitry Potapov
2008-09-23 11:16   ` Boaz Harrosh
2008-09-23 11:20     ` Boaz Harrosh
2008-09-23 14:14     ` Mario Pareja
2008-09-23 14:35       ` Boaz Harrosh
2008-09-23 13:44   ` Dmitry Potapov
