* git clone, hardlinks and multiple users?
@ 2012-01-20 17:31 Marc Herbert
2012-01-21 22:54 ` Neal Kreitzinger
2012-01-23 17:55 ` Marc Herbert
0 siblings, 2 replies; 3+ messages in thread
From: Marc Herbert @ 2012-01-20 17:31 UTC (permalink / raw)
To: git
Hi,
"git clone" is using hardlinks by default, even when cloning from a
different user. In such a case the clone ends up with a number of files
owned by someone else.
Since only immutable objects are cloned this seems to work fine. However
I would like to know if this "multiple users" case works by chance or by
specification.
In other words, is there a guarantee that no later version of git or no
obscure option I haven't used yet will ever try to touch a hardlink in
any way, for instance: trying to update some metadata timestamp,
overwriting it with the same value for lack of optimization, or any other
kind of side-effect that would obviously fail?
Thanks in advance!
Marc
* Re: git clone, hardlinks and multiple users?
From: Neal Kreitzinger @ 2012-01-21 22:54 UTC (permalink / raw)
To: Marc Herbert; +Cc: git
On 1/20/2012 11:31 AM, Marc Herbert wrote:
> Hi,
>
> "git clone" is using hardlinks by default, even when cloning from a
> different user. In such a case the clone ends up with a number of
> files owned by someone else.
>
(I assume you're using Linux.) It sounds like you specified a URL syntax
of /path/to/repo.git in your git-clone, which tells git to use hardlinks.
If you want your own copies then specify file:///path/to/repo.git in
git-clone (see git-clone manpage section "GIT URLS":
http://schacon.github.com/git/git-clone.html).
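To make the difference between the two URL forms concrete, here is a minimal sketch in Python. It assumes a POSIX system with git on the PATH and a filesystem that supports hardlinks. The helper names (count_hardlinked, compare_clones) and the throwaway repository are made up for illustration. It clones the same repository by plain path, by file:// URL, and with --no-hardlinks, then counts how many files under each clone's .git/objects/ have a link count above one.

```python
import os
import shutil
import subprocess
import tempfile


def count_hardlinked(objects_dir):
    """Count files under objects_dir whose inode has more than one name."""
    total = 0
    for root, _dirs, files in os.walk(objects_dir):
        for name in files:
            if os.lstat(os.path.join(root, name)).st_nlink > 1:
                total += 1
    return total


def git(*args, cwd):
    """Run one git command quietly; raise if it fails."""
    subprocess.run(("git",) + args, cwd=cwd, check=True, capture_output=True)


def compare_clones():
    """Return hardlinked-file counts per clone style, or None if git is absent."""
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical source repository with a single commit.
        src = os.path.join(tmp, "src")
        os.mkdir(src)
        git("init", cwd=src)
        with open(os.path.join(src, "hello.txt"), "w") as fh:
            fh.write("hello\n")
        git("add", "hello.txt", cwd=src)
        git("-c", "user.name=Demo", "-c", "user.email=demo@example.invalid",
            "-c", "commit.gpgsign=false", "commit", "-m", "initial", cwd=src)

        # Three ways of cloning the same local repository.
        git("clone", src, "by_path", cwd=tmp)
        git("clone", "file://" + src, "by_file_url", cwd=tmp)
        git("clone", "--no-hardlinks", src, "no_hardlinks", cwd=tmp)

        return {
            name: count_hardlinked(os.path.join(tmp, name, ".git", "objects"))
            for name in ("by_path", "by_file_url", "no_hardlinks")
        }


if __name__ == "__main__":
    print(compare_clones())
```

On a filesystem with hardlink support, the by_path count should be non-zero and the other two zero. If linking is not possible, git is expected to fall back to copying, in which case all three come out as zero.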
> Since only immutable objects are cloned this seems to work fine.
> However I would like to know if this "multiple users" case works by
> chance or by specification.
>
(I'm not an expert on hardlinks, Linux metadata, or git, and haven't
used hardlinks at all with Linux or git yet, but do have some experience
with git and permissions.) I think if you plan your permissions to be
based on a primary group then it will "just work". If it's not as simple
as a single primary group, then read on for my non-expert conversational
input, or at least skim through for pointers to the reliable manpage
references...
It sounds like part of your question may actually be a hardlink
question, so perhaps this info on hardlinks is useful to you:
http://linfo.org/hard_link.html. In regards to git, it does not
track metadata. However, it will track
"permissions" if you tell it to, but even then it only tracks the
executable bit to determine if it's stored in the git repo as executable
or non-executable. If you are "changing" the metadata because you
modified the file contents (or executable bit) then
you are creating a new object (in git) and not modifying the original
hardlinked object (in git or linux) or its metadata (in linux). I
assume the working-tree (ie., WORKTREE/ of WORKTREE/.git repo) of the
clone is indeed a full copy of the files via git-checkout because the
manpage only claims to use hardlinks for the object store (ie.
.git/objects/) to save diskspace on the clone of the object store, not
the checkout of the worktree. Worktree objects only get written
to the object store if you stage them to the index (git-add). Then they
are stored in .git/objects/ according to the sha-1 of their
contents. Therefore, if your worktree copy has a different owner and
you don't modify the contents or executable bit then you can't possibly
stage it because git does not detect a difference in content or
executable bit. On the other hand, if you change the contents or the
executable bit then git will consider that a change and update the
object store, but it will be a new object and not the object
representing the previous version you hardlinked to when you cloned. If
that new object is then in turn pushed to the origin repo and someone
else clones it using hardlinks then they may very well not
be able to access that object if its owner:group excludes them. More
likely, if someone pushes an object with bad permissions then others
will get push errors because git stores objects in subdirs named after
the first two chars of the sha-1 which means other objects in that
subdir will also be inaccessible. If you change permissions in regard
to the executable bit on your files without editing contents then I don't
know if git will make a new copy or modify the original inode because
I'm not sure if the executable bit is represented in the
sha-1 contents or not. In the git-init manpage there are options for
permissions/sharing under the --shared option (not to be confused with
the --shared option of git-clone, which is totally different). The
git-clone equivalent appears to be "git-clone --config
core.sharedRepository=<your-value>". Maybe these core.sharedRepository
settings in git are smart enough to handle the hardlink shared inode
metadata confusion.
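The point about objects being stored under the sha-1 of their contents can be sketched in a few lines of Python. This assumes a repository using the default SHA-1 object format and only covers blobs. The function names are made up for illustration. The ID is the SHA-1 of a short "blob <size>" header, a NUL byte, and the file contents, so the file mode is not part of it (as far as I know, the mode is recorded in the tree entry that points at the blob). Changed contents therefore give a new ID and a new file, rather than a rewrite of the existing one.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content (SHA-1 repos)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Path of a loose object: a subdir named after the first two hex chars."""
    return ".git/objects/" + object_id[:2] + "/" + object_id[2:]


if __name__ == "__main__":
    old = git_blob_id(b"hello\n")
    new = git_blob_id(b"hello, world\n")
    print(loose_object_path(old))
    print(loose_object_path(new))
```

The two printed paths differ, which is why an edit in the clone never needs to write to the object file that was hardlinked from the origin.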
> In other words, is there a guarantee that no later version of git or
> no obscure option I haven't used yet will ever try to touch a
> hardlink in any way like for instance: trying update some metadata
> timestamp or, overwrite it with the same value by lack of
> optimization, or any other kind of side-effect that would obviously
> fail.
>
However, if you cd to .git/objects/ and use chmod to change the
permission directly then I think it would change the permissions on the
inodes your origin is storing as loose objects. I'm not sure what it
would do for packed objects. There are clone options like --shared and
--reference that have special notes on the manpage explaining how you
could break things if you don't know what you're doing (that would
include hardlinks but is not exclusive to hardlinks).
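The chmod concern can be checked without git at all. A minimal sketch, assuming a POSIX filesystem with hardlink support and using two made-up file names to stand in for the origin's object and the clone's hardlink to it:

```python
import os
import stat
import tempfile


def chmod_via_link():
    """Change the mode through one name and report what the other name sees."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "origin_object")  # stands in for the origin's file
        linked = os.path.join(tmp, "clone_object")     # stands in for the clone's hardlink
        with open(original, "wb") as fh:
            fh.write(b"immutable payload")
        os.chmod(original, 0o444)
        os.link(original, linked)

        same_inode = os.stat(original).st_ino == os.stat(linked).st_ino

        # chmod on the clone's name...
        os.chmod(linked, 0o644)
        # ...is visible through the origin's name, because both share one inode.
        mode_seen_by_origin = stat.S_IMODE(os.stat(original).st_mode)
        return same_inode, mode_seen_by_origin


if __name__ == "__main__":
    same, mode = chmod_via_link()
    print(same, oct(mode))
```

Here both names belong to the same user, so the chmod succeeds. When the inode is owned by a different user, the same call would be refused for a non-root caller, which is exactly the failure mode being asked about.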
Hope this helps in some way. Perhaps someone better informed will
provide a more accurate and/or clear answer. Let me know what
you find out because I too will have to become more concerned about
diskspace and clone optimization in the very near future.
v/r,
neal
* Re: git clone, hardlinks and multiple users?
From: Marc Herbert @ 2012-01-23 17:55 UTC (permalink / raw)
To: git
On 20/01/2012 17:31, Marc Herbert wrote:
> "git clone" is using hardlinks by default, even when cloning from a
> different user. In such a case the clone ends up with a number of files
> owned by someone else.
>
> Since only immutable objects are cloned this seems to work fine. However
> I would like to know if this "multiple users" case works by chance or by
> specification.
Sorry I meant: "since only immutable objects are HARDLINKED this seems
to work fine".
A few other clarifications following Neal's long answer:
- Yes we are using Linux. But the question is about any filesystem
supporting hardlinks and user permissions.
- My question is only about hardlinks in .git/objects/. Whatever happens
in the checkout is irrelevant.
- I know how to clone with no hardlink and completely avoid the whole
issue. Unfortunately people have this strange habit of using the
simplest/default option, and it does hardlinks.
I guess my rephrased question is: while there is no obvious reason for
git to attempt to touch files in .git/objects/, is there a promise that
this will never, ever happen? Because it would fail in a multi-user config.
The "core.sharedRepository" option is a good example. When set to a new
value, will it ever try to fix existing objects? That would fail.
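To see how exposed a given clone is to this, one can list the files under .git/objects/ that are both hardlinked and owned by someone else. A minimal sketch in Python, assuming a POSIX system. The function name and the my_uid parameter are made up for illustration, and this is not something git itself provides:

```python
import os
import tempfile


def foreign_hardlinks(objects_dir, my_uid=None):
    """List (path, owner_uid, link_count) for hardlinked files not owned by my_uid."""
    if my_uid is None:
        my_uid = os.getuid()
    found = []
    for root, _dirs, files in os.walk(objects_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            info = os.lstat(path)
            # Any attempt to chmod or rewrite such a file in place would fail
            # for a non-root user who does not own the inode.
            if info.st_nlink > 1 and info.st_uid != my_uid:
                found.append((path, info.st_uid, info.st_nlink))
    return found


if __name__ == "__main__":
    for path, uid, nlink in foreign_hardlinks(os.path.join(".git", "objects")):
        print(uid, nlink, path)
```

An empty result means every hardlinked object is owned by the current user, so the question above does not arise for that clone.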