git.vger.kernel.org archive mirror
* empty directories
@ 2007-08-21 17:14 Josh England
  2007-08-21 17:40 ` Sean
                   ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Josh England @ 2007-08-21 17:14 UTC (permalink / raw)
  To: git

Hi,

Git doesn't seem to allow me to add an empty directory to the index, or
even nested empty directories.  Is there any way to do this?  What is
the reasoning?  I've got a use case where having empty directories in my
git repository would be *very* valuable.  Any information and help is
greatly appreciated.

-JE

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-21 17:14 empty directories Josh England
@ 2007-08-21 17:40 ` Sean
  2007-08-22 21:25   ` Josh England
  2007-08-22  0:06 ` Jakub Narebski
  2007-08-22  4:31 ` Salikh Zakirov
  2 siblings, 1 reply; 41+ messages in thread
From: Sean @ 2007-08-21 17:40 UTC (permalink / raw)
  To: Josh England; +Cc: git

On Tue, 21 Aug 2007 11:14:21 -0600
"Josh England" <jjengla@sandia.gov> wrote:

> Git doesn't seem to allow me to add an empty directory to the index, or
> even nested empty directories.  Is there any way to do this?  What is
> the reasoning?  I've got a use case where having empty directories in my
> git repository would be *very* valuable.  Any information and help is
> greatly appreciated.

Hi Josh,

Git doesn't track empty directories.  There is a brief note about it in
the FAQ:

 http://git.or.cz/gitwiki/GitFaq#head-1fbd4a018d45259c197b169e87dafce2a3c6b5f9

Sean

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-21 17:14 empty directories Josh England
  2007-08-21 17:40 ` Sean
@ 2007-08-22  0:06 ` Jakub Narebski
  2007-08-22  4:31 ` Salikh Zakirov
  2 siblings, 0 replies; 41+ messages in thread
From: Jakub Narebski @ 2007-08-22  0:06 UTC (permalink / raw)
  To: git

Josh England wrote:


> Git doesn't seem to allow me to add an empty directory to the index, or
> even nested empty directories.  Is there any way to do this?  What is
> the reasoning?  I've got a use case where having empty directories in my
> git repository would be *very* valuable.  Any information and help is
> greatly appreciated.

Git does not track empty directories [yet], but you can use the empty
.gitignore file trick to mark "empty" directories to be added.
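
The placeholder trick can be scripted.  What follows is a minimal sketch
in Python, assuming all you want is an empty .gitignore dropped into every
directory that has no entries at all.  The helper name add_placeholders is
made up for illustration; it is not a git command.

```python
import os


def add_placeholders(root):
    """Create an empty .gitignore in every empty directory below root.

    Returns the list of placeholder files that were created.
    Hypothetical helper for illustration, not part of git.
    """
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into the repository metadata itself.
        has_git_dir = ".git" in dirnames
        if has_git_dir:
            dirnames.remove(".git")
        # Only directories with no entries at all need a placeholder;
        # a parent holding only subdirectories comes along for free
        # once one of its descendants contains a tracked file.
        if not dirnames and not filenames and not has_git_dir:
            placeholder = os.path.join(dirpath, ".gitignore")
            open(placeholder, "w").close()
            created.append(placeholder)
    return created
```

After running it, "git add" picks up the placeholders like any other file,
so nested empty directories such as a/b/c survive a clone.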

There was some discussion about this on the git mailing list (see the
archives), and this issue is most probably mentioned on the GitFaq page in
the git wiki.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-21 17:14 empty directories Josh England
  2007-08-21 17:40 ` Sean
  2007-08-22  0:06 ` Jakub Narebski
@ 2007-08-22  4:31 ` Salikh Zakirov
  2007-08-22 18:46   ` Linus Torvalds
  2 siblings, 1 reply; 41+ messages in thread
From: Salikh Zakirov @ 2007-08-22  4:31 UTC (permalink / raw)
  To: git

Josh England wrote:
> Git doesn't seem to allow me to add an empty directory to the index, or
> even nested empty directories.  Is there any way to do this?  What is
> the reasoning?  I've got a use case where having empty directories in my
> git repository would be *very* valuable.  Any information and help is
> greatly appreciated.

While the other replies provided historical background on how exactly
git handles directories and why it doesn't store empty directories,
there is no fundamental reason for empty directories not being stored;
it's just that nobody has gotten around to implementing it.

Linus Torvalds posted an untested patch in a recent discussion and requested
that anyone interested in this functionality continue development and testing.

Design discussion: http://lists-archives.org/git/624494-empty-directories.html
Patch: http://marc.info/?l=git&m=118480075313827&w=2

Johannes Schindelin also posted an alternative implementation, which emulates
empty dirs by adding an empty .gitignore placeholder to the index:
http://marc.info/?l=git&m=118484785410247&w=2

You could also read the long discussion of the subtle semantic issues that
storing empty directories introduces, in the mail thread accessible from the
links above.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-22  4:31 ` Salikh Zakirov
@ 2007-08-22 18:46   ` Linus Torvalds
  2007-08-22 19:12     ` David Kastrup
  0 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2007-08-22 18:46 UTC (permalink / raw)
  To: Salikh Zakirov; +Cc: git



On Wed, 22 Aug 2007, Salikh Zakirov wrote:
> 
> Linus Torvalds posted an untested patch in a recent discussion and requested
> that anyone interested in this functionality continued development and testing.

That untested patch was seriously broken - it didn't do the sorting of 
empty directories right. So it would need a lot of other work.

So I'm firmly back in the "just add a '.gitignore' file to the directory" 
camp.

Or you can fake it out entirely by making it an empty subproject, which 
also gives you an empty directory.

			Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-22 18:46   ` Linus Torvalds
@ 2007-08-22 19:12     ` David Kastrup
  0 siblings, 0 replies; 41+ messages in thread
From: David Kastrup @ 2007-08-22 19:12 UTC (permalink / raw)
  To: git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 22 Aug 2007, Salikh Zakirov wrote:
>> 
>> Linus Torvalds posted an untested patch in a recent discussion and
>> requested that anyone interested in this functionality continued
>> development and testing.
>
> That untested patch was seriously broken - it didn't do the sorting
> of empty directories right.

Well, it depends on where one wants to see directories sorted in the
index: the index sort order does not necessarily need to be the same
as the repository sort order: merge conflict detection could benefit
from sorting the directory "early" in the index.  Of course, this
would mean that one needed to stash away directories temporarily while
processing the index until the corresponding tree in the repository
comes up.

> So it would need a lot of other work.

With either choice of sort order, yes.  One place or the other.

-- 
David Kastrup

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-21 17:40 ` Sean
@ 2007-08-22 21:25   ` Josh England
  2007-08-22 23:25     ` Linus Torvalds
  2007-08-22 23:40     ` Jakub Narebski
  0 siblings, 2 replies; 41+ messages in thread
From: Josh England @ 2007-08-22 21:25 UTC (permalink / raw)
  To: git

On Wed, 22 Aug 2007, Linus Torvalds wrote:
> On Wed, 22 Aug 2007, Salikh Zakirov wrote:
> > 
> > Linus Torvalds posted an untested patch in a recent discussion and requested
> > that anyone interested in this functionality continued development and testing.
> 
> That untested patch was seriously broken - it didn't do the sorting of 
> empty directories right. So it would need a lot of other work.
> 
> So I'm firmly back in the "just add a '.gitignore' file to the directory" 
> camp.

Whoa.  I just spent much of the morning reading the history of this
thread. My eyes are still bleeding, but I think I'm informed enough to
be dangerous.

Without actually sticking my head in the honey pot surrounded by giant
bears, I just want to relate a revision control scenario that I've been
wanting to solve for several years. I deploy/maintain many linux
clusters that each have a single system image to boot all nodes on the
machines. My desire is to shove an *entire* image into a git
repository, and simply have it do the right thing.  Doing so and using
clones/branches/merges to maintain these images would be extremely
useful.  I've attempted this concept with several SCMs using various
workarounds for each but have abandoned each attempt mainly due to
performance issues.  Git shows the best performance by far (to the
point of actually being usable) for this purpose.

Forget about special files as those are almost certainly a lost cause.
I'm willing to use .gitignore in empty directories until a better
solution presents itself.  The main need is for file
ownership/permission, which has been touched on before.  When I clone
an image, I really want an *identical* clone, in every way.  It seems
as though git had this functionality but scrapped it due to issues with
umask and merge type problems?  So the question is:  would there be any
way to bring this functionality back as a non-default configurable
option?  For those of us who need the functionality, we'd be more than
willing to live with some of the side-effects.

The alternatives (involving wrappers and strict policy) just haven't
been idiot-proof enough to be truly viable.  It almost has to be a
built-in capability.  It looks like Nax is doing something close to
this.  Is anyone else trying to use git in a similar way?

-JE

PS:  I know this falls outside of git's intended use, but it's the
closest thing to something that could work.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-22 21:25   ` Josh England
@ 2007-08-22 23:25     ` Linus Torvalds
  2007-08-22 23:55       ` David Kastrup
                         ` (3 more replies)
  2007-08-22 23:40     ` Jakub Narebski
  1 sibling, 4 replies; 41+ messages in thread
From: Linus Torvalds @ 2007-08-22 23:25 UTC (permalink / raw)
  To: Josh England; +Cc: git



On Wed, 22 Aug 2007, Josh England wrote:
>
> The main need is for file ownership/permission, which has been touched 
> on before.  When I clone an image, I really want an *identical* clone, 
> in every way.  It seems as though git had this functionality but 
> scrapped it due to issues with umask and merge type problems?

Well, git had all permission bits, but never ownership. And yes, using 
more than the one user-x-bit ended up being totally unusable for source 
code, because of different people having different umasks, so we 
effectively dropped the permission bits too (although the data format was 
retained, so we could re-introduce them with some flag that says "honor 
all permission bits, not just the x bit").
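
To make the "only the x bit" behavior concrete, here is a small sketch in
Python of how a regular file's mode collapses to one of two recorded values
(100644 or 100755).  It is an illustration of the rule described above, not
git's actual code, and the function name is an assumption made for this
example.

```python
import stat


def canonical_file_mode(mode):
    """Collapse a regular file's st_mode to the two values git records.

    Only the owner's execute bit survives; group/other bits and the
    write bits, which depend on each user's umask, are discarded.
    """
    if not stat.S_ISREG(mode):
        raise ValueError("only regular files are handled in this sketch")
    return 0o100755 if mode & stat.S_IXUSR else 0o100644
```

Two people checking out the same file under umask 002 and umask 022 end up
with 0664 and 0644 on disk, but both collapse to 100644, which is why the
umask difference no longer shows up as a spurious change.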

But the ownership thing we've never even tried to support, since it was so 
obviously not something that was appropriate for a distributed project. So 
if you want an identical clone with ownership and (full) permissions, you 
really do need to have some alternate way to fill in the blanks.

I've argued that ".gitattributes" may be an acceptable alternate, 
especially since ownership is often something that is less than "per 
file", and more often "has certain patterns".

> So the question is:  would there be any way to bring this functionality 
> back as a non-default configurable option?  For those of us who need the 
> functionality, we'd be more than willing to live with some of the 
> side-effects.

Full permissions might be easy enough to resurrect, but since it's still 
pointless without ownership, that really isn't even relevant.

But if .gitattributes would work, you probably could introduce both full 
permissions and ownership rules there. We read git attributes for *other* 
reasons when checking files out _anyway_, ie we need the CRLF attribute 
stuff, so adding ownership attributes would not be at all odd.

		Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-22 21:25   ` Josh England
  2007-08-22 23:25     ` Linus Torvalds
@ 2007-08-22 23:40     ` Jakub Narebski
  1 sibling, 0 replies; 41+ messages in thread
From: Jakub Narebski @ 2007-08-22 23:40 UTC (permalink / raw)
  To: git

[Cc: Josh England <jjengla@sandia.gov>, git@vger.kernel.org]

Josh England wrote:

> [...]  The main need is for file
> ownership/permission, which has been touched on before.  When I clone
> an image, I really want an *identical* clone, in every way.  It seems
> as though git had this functionality but scrapped it due to issues with
> umask and merge type problems?  So the question is:  would there be any
> way to bring this functionality back as a non-default configurable
> option?  For those of us who need the functionality, we'd be more than
> willing to live with some of the side-effects.
> 
> The alternatives (involving wrappers and strict policy) just haven't
> been idiot-proof enough to be truly viable.  It almost has to be a
> built-in capability.  It looks like Nax is doing something close to
> this.  Is anyone else trying to use git in a similar way?

Check out (via e.g. the http://git.or.cz/gitwiki/InterfacesFrontendsAndTools
wiki page) IsiSetup, which is a tool to manage configuration files (including
permissions) that uses git as its engine, and metastore, which is meant as
a tool to use in an appropriate hook for storing/restoring permissions etc.

And as Linus told you, if you have time to work on it, you can try to
make .gitattributes work for this...

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-22 23:25     ` Linus Torvalds
@ 2007-08-22 23:55       ` David Kastrup
  2007-08-23 15:24       ` Josh England
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 41+ messages in thread
From: David Kastrup @ 2007-08-22 23:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Josh England, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Full permissions might be easy enough to resurrect, but since it's
> still pointless without ownership, that really isn't even relevant.

I'd not call it entirely pointless without ownership: under most
systems, only root can do chown, so for example a private backup of a
home directory usually has unique ownership (and nothing but the
normal ownership could be restored by a user, anyway).

However, once the user is a member of more than a single group and
actually makes _use_ of that, we are getting on thin ice.  But at
least differing group ownership is usually much better contained (and
thus reconstructible manually in the case of an emergency) than the
permissions are.

Since tracking permissions would be a per-project decision (nothing
else makes any sense), it should be workable to amend the tree records
themselves by adding ownership and ACL and whatever else optionally
right there in-place if one figures out a good syntax for it.

One still needs to come up with a good and flexible way to implement
policies: what kind of permissions/ownership data will be let into the
repository from workdir/pushing, and what won't?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: empty directories
  2007-08-22 23:25     ` Linus Torvalds
  2007-08-22 23:55       ` David Kastrup
@ 2007-08-23 15:24       ` Josh England
  2007-08-23 21:51       ` tracking perms/ownership [was: empty directories] Josh England
  2007-08-24 17:10       ` empty directories Jason Garber
  3 siblings, 0 replies; 41+ messages in thread
From: Josh England @ 2007-08-23 15:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed, 2007-08-22 at 16:25 -0700, Linus Torvalds wrote:
> But if .gitattributes would work, you probably could introduce both full 
> permissions and ownership rules there. We read git attributes for *other* 
> reasons when checking files out _anyway_, ie we need the CRLF attribute 
> stuff, so adding ownership attributes would not be at all odd.

OK, this looks like it has the desired effect.  commits/pulls/etc catch
and update the execute bit.  I'll try to find how .gitattributes hooks
in.  Any pointers/tips are appreciated.

-JE

^ permalink raw reply	[flat|nested] 41+ messages in thread

* tracking perms/ownership [was: empty directories]
  2007-08-22 23:25     ` Linus Torvalds
  2007-08-22 23:55       ` David Kastrup
  2007-08-23 15:24       ` Josh England
@ 2007-08-23 21:51       ` Josh England
  2007-08-23 22:08         ` tracking perms/ownership Junio C Hamano
  2007-08-24  9:38         ` tracking perms/ownership [was: empty directories] Johannes Schindelin
  2007-08-24 17:10       ` empty directories Jason Garber
  3 siblings, 2 replies; 41+ messages in thread
From: Josh England @ 2007-08-23 21:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed, 2007-08-22 at 16:25 -0700, Linus Torvalds wrote:
> But if .gitattributes would work, you probably could introduce both full 
> permissions and ownership rules there. We read git attributes for *other* 
> reasons when checking files out _anyway_, ie we need the CRLF attribute 
> stuff, so adding ownership attributes would not be at all odd.

So here's the initial thought.  Create two new gitattributes, 'perms'
and 'ownership', which will track perms/ownership for files matching the
given pattern.

Looking at the index struct, it already has fields in it for file mode,
uid, and gid (woohoo!).  It looks like an addition to
builtin-update-index.c could set those fields (if the gitattribute is
set) the same way the execute bit is flipped with chmod_path().
I haven't found where the chmod is done at checkout/clone time, but the
question is:  If the mode, uid, and gid are stuffed in the index, will
git diff simply work to recognize permission/ownership changes?  Is
this the right approach? What kind of merging issues would I need to
worry about?

-JE

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-23 21:51       ` tracking perms/ownership [was: empty directories] Josh England
@ 2007-08-23 22:08         ` Junio C Hamano
  2007-08-23 23:30           ` Linus Torvalds
                             ` (2 more replies)
  2007-08-24  9:38         ` tracking perms/ownership [was: empty directories] Johannes Schindelin
  1 sibling, 3 replies; 41+ messages in thread
From: Junio C Hamano @ 2007-08-23 22:08 UTC (permalink / raw)
  To: Josh England; +Cc: Linus Torvalds, git

"Josh England" <jjengla@sandia.gov> writes:

> Looking at the index struct, it already has fields in it for file mode
> uid and gid (woohoo!).

I can see that storing textual names in gitattributes and having
the root user run git so that it can chown(), would work.

But this is only about checkout.  After you chown a file in the
work tree and run update-index, next write-tree would not record
it, as there is no place in tree objects to record uid/gid.
You would need to arrange so that a matching change is made in
the gitattributes file if you go that route.

Suppose you had:

	etc/*		owner=root
	etc/frotz	owner=nobody

in gitattributes, and you did a checkout.  Then you chown etc/nitfol
with "chown printer etc/nitfol".  Somebody needs to add a line

	etc/nitfol	owner=printer

to gitattributes before you make the commit.  Maybe the chown
was not about etc/nitfol but about making etc/frotz owned by
root.  Then you would, instead of adding the etc/nitfol line,
remove existing etc/frotz line so that earlier glob would
capture and express the idea of making everything owned by
root.  I suspect this would get rather tricky quickly.
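
The lookup that the example above relies on can be sketched as follows.
This is an illustration in Python, not git's attribute code: "owner" is the
hypothetical attribute from the example, the helper names are invented, and
fnmatch is only an approximation of gitattributes pattern matching (its "*"
also matches "/", which git's does not).  The one rule borrowed from
gitattributes is that a later matching line overrides an earlier one.

```python
from fnmatch import fnmatchcase


def parse_rules(text):
    """Parse 'pattern owner=name' lines into (pattern, owner) pairs."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not an owner= rule.
        if len(fields) != 2 or not fields[1].startswith("owner="):
            continue
        rules.append((fields[0], fields[1][len("owner="):]))
    return rules


def owner_of(path, rules):
    """Return the owner for path; the last matching rule wins."""
    owner = None
    for pattern, value in rules:
        if fnmatchcase(path, pattern):
            owner = value
    return owner


RULES = parse_rules("""
etc/*		owner=root
etc/frotz	owner=nobody
""")
```

With these rules, etc/frotz resolves to "nobody" and etc/nitfol to "root".
Recording the chown means either appending an etc/nitfol line or deleting
the etc/frotz line, and deciding between the two automatically is the
tricky part.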

Of course, you would need to worry about resolving merge
conflicts of gitattributes file, too.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-23 22:08         ` tracking perms/ownership Junio C Hamano
@ 2007-08-23 23:30           ` Linus Torvalds
  2007-08-24  6:16             ` David Kastrup
  2007-08-24  7:22           ` Josh England
  2007-08-24 16:11           ` Josh England
  2 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2007-08-23 23:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Josh England, git



On Thu, 23 Aug 2007, Junio C Hamano wrote:
>
> "Josh England" <jjengla@sandia.gov> writes: 
> > Looking at the index struct, it already has fields in it for file mode
> > uid and gid (woohoo!).
> 
> I can see that storing textual names in gitattributes and having
> the root user run git so that it can chown(), would work.

Well, the nice thing is that even non-root can actually resolve merge 
conflicts and generally use the archive, even if non-root obviously cannot 
then actually set the files to those users/groups!

So handling ownership outside of the actual filesystem, in a separate file 
that git tracks, actually allows you to do things that you couldn't 
otherwise sanely do.

It obviously does have downsides:

> But this is only about checkout.  After you chown a file in the
> work tree and run update-index, next write-tree would not record
> it, as there is no place in tree objects to record uid/gid.

This is a direct consequence of allowing non-root to actually work with 
such a repository: the git-tracked ownership information simply is 
separate, and "git update-index" and friends will never do anything about 
it, since they just can't rely on the *filesystem* user/group information 
anyway (because normal users would never be allowed to set it, anyway).

		Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-23 23:30           ` Linus Torvalds
@ 2007-08-24  6:16             ` David Kastrup
  2007-08-24  6:37               ` Linus Torvalds
  0 siblings, 1 reply; 41+ messages in thread
From: David Kastrup @ 2007-08-24  6:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Josh England, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, 23 Aug 2007, Junio C Hamano wrote:
>>
>> "Josh England" <jjengla@sandia.gov> writes: 
>> > Looking at the index struct, it already has fields in it for file mode
>> > uid and gid (woohoo!).
>> 
>> I can see that storing textual names in gitattributes and having
>> the root user run git so that it can chown(), would work.
>
> Well, the nice thing is that even non-root can actually resolve
> merge conflicts and generally use the archive, even if non-root
> obviously cannot then actually set the files to those users/groups!
>
> So handling ownership outside of the actual filesystem, in a
> separate file that git tracks, actually allows you to do things that
> you couldn't otherwise sanely do.

Well, about that "sane" bit: I don't see an application for tracking
unrestorable ownership values.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-24  6:16             ` David Kastrup
@ 2007-08-24  6:37               ` Linus Torvalds
  2007-08-24  7:38                 ` Josh England
  2007-08-24  7:50                 ` David Kastrup
  0 siblings, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2007-08-24  6:37 UTC (permalink / raw)
  To: David Kastrup; +Cc: Junio C Hamano, Josh England, git



On Fri, 24 Aug 2007, David Kastrup wrote:
> >
> > So handling ownership outside of the actual filesystem, in a
> > separate file that git tracks, actually allows you to do things that
> > you couldn't otherwise sanely do.
> 
> Well, about that "sane" bit: I don't see an application for tracking
> unrestorable ownership values.

Umm. Like an RPM spec file?

The thing you "don't see an application" for is exactly the kind of things 
that people very much ALREADY DO. 

There are tons of different setups for setting up user and group ownership 
(and things like permission) in almost any project. And I can pretty much 
*guarantee* you that none of them depend on actually having ownership on 
the files themselves.

In git, just for fun, do

	git grep defattr

or even just look into the Makefile, and think about what a line like

	$(INSTALL) -d -m755 '$(DESTDIR_SQ)$(bindir_SQ)'

means, and why it has a "755" there, and why other Makefiles quite 
often have things like "-o bin" etc on such lines!

See? Those ownership things are restorable *as*root*, but that doesn't 
mean that everybody should do development as root. In fact, I'd argue that 
any system that is set up so that you have to develop and merge things 
while being root is pretty damn broken.

Which means that any such environment *has* to encode the ownership 
*separately* from the actual filesystem ownership. Because doing it in the 
filesystem simply isn't sane.

So yes, you could have an insane piece of crap that actually tracks file 
ownership in the filesystem, and requires people to be root.

Or you could use a ".gitattributes" file or similar _external_ tracking 
method that allows even people who cannot actually set ownership to work 
with it.

Your choice. But I know which one I'd choose.

			Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-23 22:08         ` tracking perms/ownership Junio C Hamano
  2007-08-23 23:30           ` Linus Torvalds
@ 2007-08-24  7:22           ` Josh England
  2007-08-24  7:39             ` Junio C Hamano
  2007-08-24 16:11           ` Josh England
  2 siblings, 1 reply; 41+ messages in thread
From: Josh England @ 2007-08-24  7:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Thu, 2007-08-23 at 15:08 -0700, Junio C Hamano wrote: 
> "Josh England" <jjengla@sandia.gov> writes:
> 
> > Looking at the index struct, it already has fields in it for file mode
> > uid and gid (woohoo!).
> 
> I can see that storing textual names in gitattributes and having
> the root user run git so that it can chown(), would work.
> 
> But this is only about checkout.  After you chown a file in the
> work tree and run update-index, next write-tree would not record
> it, as there is no place in tree objects to record uid/gid.
> You would need to arrange so that a matching change is made in
> the gitattributes file if you go that route.

That's ok.  Any place to store the data is fine by me.  I'm just
concerned about a comment I saw in attr.c (line 13):
/*
The basic design decision here is that we are not going to have insanely
large number of attributes.
This is a randomly chosen prime.
*/
#define HASHSIZE 257

Using a brute-force perm/ownership attribute set for every file, and
assuming a modestly populated Linux distribution image with upwards of
150,000 files/directories in it, that's sticking over 100,000 attributes
into some .gitattributes file somewhere.  Do you think the gitattributes
system can handle this kind of abuse?

> If you had:
> 
> 	etc/*		owner=root
>         etc/frotz	owner=nobody
> 
> in gitattributes, and you did a checkout.  You chown etc/nitfol
> with "chown printer etc/nitfol".  Somebody needs to add a line
> 
> 	etc/nitfol	owner=printer
> 
> to gitattributes before you make the commit.

Unless this 'somebody' is an automated process, that will never fly. I
want git to do it for me when the right config/attr is set (maybe at
update_index time).  That's where my concern about the gitattributes
system comes from.  What's going to happen when I stick 150,000 (est)
attributes in there?

> Maybe the chown
> was not about etc/nitfol but about making etc/frotz owned by
> root.  Then you would, instead of adding the etc/nitfol line,
> remove existing etc/frotz line so that earlier glob would
> capture and express the idea of making everything owned by
> root.  I suspect this would get rather tricky quickly.

Maybe doable though.  Starting from the root of the tree, traverse
downwards and only add a new attribute when a file or dir's ownership
differs from its parent's.  This could optimize away many of
the attributes needed.  I think a good place might be right in
index_path() because the lstat data is fresh and accessible.  Writing
attrs out to file if necessary should hopefully not add too much overhead.
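
The traversal idea can be sketched roughly like this in Python.  It assumes
a made-up model in which an entry inherits the owner of the nearest
ancestor that has a rule; gitattributes patterns do not inherit this way
out of the box, so treat both helper names and the inheritance rule as
assumptions for illustration only.

```python
import posixpath


def compact_owners(owners):
    """Reduce a full path -> owner map to the rules that are needed.

    owners must contain every file and directory, with the tree root
    stored under the empty string.  A rule is emitted for the root and
    for each entry whose owner differs from its parent's owner.
    """
    rules = []
    for path in sorted(owners):
        if path == "":
            rules.append((path, owners[path]))
        elif owners[path] != owners[posixpath.dirname(path)]:
            rules.append((path, owners[path]))
    return rules


def effective_owner(path, rules):
    """Resolve a path by walking up to the nearest ancestor with a rule."""
    table = dict(rules)
    while path not in table:
        if path == "":
            raise KeyError("no rule for the root directory")
        path = posixpath.dirname(path)
    return table[path]
```

In an image where nearly everything belongs to root, only the handful of
entries that break the pattern generate rules, which is far fewer than one
attribute per file.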

-JE

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-24  6:37               ` Linus Torvalds
@ 2007-08-24  7:38                 ` Josh England
  2007-08-24  7:50                 ` David Kastrup
  1 sibling, 0 replies; 41+ messages in thread
From: Josh England @ 2007-08-24  7:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Kastrup, Junio C Hamano, git

On Thu, 2007-08-23 at 23:37 -0700, Linus Torvalds wrote:
> 
> On Fri, 24 Aug 2007, David Kastrup wrote:
> > >
> > > So handling ownership outside of the actual filesystem, in a
> > > separate file that git tracks, actually allows you to do things that
> > > you couldn't otherwise sanely do.
> > 
> > Well, about that "sane" bit: I don't see an application for tracking
> > unrestorable ownership values.
> 
> Umm. Like an RPM spec file?
> 
> The thing you "don't see an application" for is exactly the kind of things 
> that people very much ALREADY DO. 
> 
> There are tons of different setups for setting up user and group ownership 
> (and things like permission) in almost any project. And I can pretty much 
> *guarantee* you that none of them depend on actually having ownership on 
> the files themselves.
> 
> In git, just for fun, do
> 
> 	git grep defattr
> 
> or even just look into the Makefile, and think about what lines like that
> 
> 	$(INSTALL) -d -m755 '$(DESTDIR_SQ)$(bindir_SQ)'
> 
> thing means, and why it has a "755" there, and why other Makefiles quite 
> often have things like "-o bin" etc on such lines!

Yes. Permission bits are useful.  I wouldn't want a umask clobbering
some /bin directory to 0644 or some such.

> See? Those ownership things are restorable *as*root*, but that doesn't 
> mean that everybody should do development as root. In fact, I'd argue
>  that any system that is set up so that you have to develop and merge
>  things while being root is pretty damn broken.

If your repository is a full system image (my extreme case), developing
as root (installing packages, altering configs) is *required* if you
expect the image to boot/behave properly.  Squashing ownership in this
case would undoubtedly break many things.

> Which means that any such environment *has* to encode the ownership 
> *separately* from the actual filesystem ownership. Because doing it in the 
> filesystem simply isn't sane.
> 
> So yes, you could have an insane piece of crap that actually tracks file 
> ownership in the filesystem, and requires people to be root.

This is what I've done with SVN.  The mechanisms to save/restore
perms/ownership can be run as hooks before checkin and after checkout.
The performance is pretty depressing even without running those hooks
every time.  I'm just hoping that using .gitattributes will perform
reasonably well.
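
For reference, a hook-style save/restore pass might look something like the
Python sketch below.  It is not the SVN setup described above nor any
existing tool: the JSON manifest format and both function names are
assumptions.  Modes are always re-applied; ownership is re-applied only
when running as root, and otherwise the mismatching paths are reported
back to the caller.

```python
import json
import os
import stat


def save_metadata(root, manifest):
    """Record mode, uid and gid of everything below root (pre-commit)."""
    records = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISLNK(st.st_mode):
                continue  # symlink metadata is skipped in this sketch
            records[os.path.relpath(full, root)] = {
                "mode": stat.S_IMODE(st.st_mode),
                "uid": st.st_uid,
                "gid": st.st_gid,
            }
    with open(manifest, "w") as fh:
        json.dump(records, fh, indent=1, sort_keys=True)


def restore_metadata(root, manifest):
    """Re-apply recorded metadata (post-checkout).

    Returns the sorted list of paths whose ownership differs from the
    manifest but could not be changed because we are not root.
    """
    with open(manifest) as fh:
        records = json.load(fh)
    mismatched = []
    for rel, meta in records.items():
        full = os.path.join(root, rel)
        if not os.path.lexists(full):
            continue
        os.chmod(full, meta["mode"])
        st = os.lstat(full)
        if (st.st_uid, st.st_gid) != (meta["uid"], meta["gid"]):
            if os.geteuid() == 0:
                os.chown(full, meta["uid"], meta["gid"])
            else:
                mismatched.append(rel)
    return sorted(mismatched)
```

Walking and stat-ing every entry on every commit and checkout is where the
cost comes from on a tree with 150,000 entries.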

> Or you could use a ".gitattributes" file or similar _external_ tracking 
> method that allows even people who cannot actually set ownership to work 
> with it.

Yes, although it would be nice if a clone or a pull told me (running as
a user) that the ownership being set doesn't match the uid/gid in the
attribute file.  Principle of least surprise.  For those actually
requesting the behavior, a little extra verbosity seems pretty
acceptable.

-JE

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-24  7:22           ` Josh England
@ 2007-08-24  7:39             ` Junio C Hamano
  2007-08-24  8:19               ` Josh England
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2007-08-24  7:39 UTC (permalink / raw)
  To: Josh England; +Cc: Junio C Hamano, Linus Torvalds, git

"Josh England" <jjengla@sandia.gov> writes:

> That's ok.  Any place to store the data is fine by me.  I'm just
> concerned about some comments I saw in attrs.c <line13>:
> /*
> The basic design decision here is that we are not going to have insanely
> large number of attributes.
> This is a randomly chosen prime.
> */
> #define HASHSIZE 257

That talks about the size of the vocabulary of attribute names,
such as "diff", "crlf", "merge".  IIRC, you need two more
(owner, perm) or maybe three (group), not 150k.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: tracking perms/ownership
  2007-08-24  6:37               ` Linus Torvalds
  2007-08-24  7:38                 ` Josh England
@ 2007-08-24  7:50                 ` David Kastrup
  2007-08-24 17:51                   ` Linus Torvalds
  1 sibling, 1 reply; 41+ messages in thread
From: David Kastrup @ 2007-08-24  7:50 UTC (permalink / raw)
  To: git


[copied to gmane after sending a personal copy by accident, since mail
 from my work account doesn't arrive on the list.  Sorry for the
 duplication.]

Linus Torvalds <torvalds@linux-foundation.org> writes:


[make install]

> See? Those ownership things are restorable *as*root*, but that
> doesn't mean that everybody should do development as root. In fact,
> I'd argue that any system that is set up so that you have to develop
> and merge things while being root is pretty damn broken.
>
> Which means that any such environment *has* to encode the ownership
> *separately* from the actual filesystem ownership. Because doing it
> in the filesystem simply isn't sane.

But in this case you have a work directory and an installation
directory.  And you have an installation procedure.  No tracking is
involved at all.

> So yes, you could have an insane piece of crap that actually tracks
> file ownership in the filesystem, and requires people to be root.

In your example, neither installed files nor ownership are tracked in
the filesystem.  Both are "tracked" in the Makefile.  Or rather than
being tracked, they are explicitly catered for by the user.

> Or you could use a ".gitattributes" file or similar _external_
> tracking method that allows even people who cannot actually set
> ownership to work with it.

git is a content _tracker_.  It tracks contents, also contents that
move around.  If it can't track the permissions moving around as well,
it's sort of pointless to integrate this into git: if you have to
manage the stuff yourself, anyway, there is no point in creating the
illusion that it is done by git.

> Your choice. But I know which one I'd choose.

That's fine.  But you don't actually need git at all to implement your
choice, so this is orthogonal to whether having an option to do it
inside of git might be worth having.

-- 
David Kastrup


* Re: tracking perms/ownership
  2007-08-24  7:39             ` Junio C Hamano
@ 2007-08-24  8:19               ` Josh England
  0 siblings, 0 replies; 41+ messages in thread
From: Josh England @ 2007-08-24  8:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Fri, 2007-08-24 at 00:39 -0700, Junio C Hamano wrote:
> "Josh England" <jjengla@sandia.gov> writes:
> 
> > That's ok.  Any place to store the data is fine by me.  I'm just
> > concerned about some comments I saw in attr.c <line 13>:
> > /*
> > The basic design decision here is that we are not going to have insanely
> > large number of attributes.
> > This is a randomly chosen prime.
> > */
> > #define HASHSIZE 257
> 
> That talks about the size of the vocabulary of attribute names,
> such as "diff", "crlf", "merge".  IIRC, you need two more
> (owner, perm) or maybe three (group), not 150k.

OK that's comforting.  The 150k above though is not # of attribute
*types* (perms/uid/gid or whatever), it is number of attribute *entries*
in a .gitattributes file (eg:  /etc/sudoers  mode=0440 uid=0 gid=0).
Hopefully it won't actually be as high as 150k; I don't know.
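
A minimal sketch of reading such entries, assuming the one-line-per-path
layout from the example above.  The mode/uid/gid names come from that
example only; git itself defines no such attributes, and paths containing
whitespace are not handled here.

```python
def parse_entry(line):
    """Split 'PATH key=value ...' into (path, {key: value})."""
    fields = line.split()
    path, attrs = fields[0], {}
    for field in fields[1:]:
        # partition() keeps everything after the first '=' as the value
        key, _, value = field.partition("=")
        attrs[key] = value
    return path, attrs


def load_entries(text):
    """Parse a whole file, skipping blank lines and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            path, attrs = parse_entry(line)
            entries[path] = attrs
    return entries
```

With this shape, 150k entries are just 150k keys in one dictionary keyed
by path; the set of attribute *names* stays at three.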

-JE


* Re: tracking perms/ownership [was: empty directories]
  2007-08-23 21:51       ` tracking perms/ownership [was: empty directories] Josh England
  2007-08-23 22:08         ` tracking perms/ownership Junio C Hamano
@ 2007-08-24  9:38         ` Johannes Schindelin
  2007-08-24  9:52           ` Jeff King
  2007-08-24 10:05           ` tracking perms/ownership [was: empty directories] Jeff King
  1 sibling, 2 replies; 41+ messages in thread
From: Johannes Schindelin @ 2007-08-24  9:38 UTC (permalink / raw)
  To: Josh England; +Cc: Linus Torvalds, git

Hi,

On Thu, 23 Aug 2007, Josh England wrote:

> On Wed, 2007-08-22 at 16:25 -0700, Linus Torvalds wrote:
> > But if .gitattributes would work, you probably could introduce both full 
> > permissions and ownership rules there. We read git attributes for *other* 
> > reasons when checking files out _anyway_, ie we need the CRLF attribute 
> > stuff, so adding ownership attributes would not be at all odd.
> 
> So here's the initial thought.  Create two new gitattributes, 'perms'
> and 'ownership', which will track perms/ownership for files matching the
> given pattern.

I wonder why you do not just use the "smudge" and "clean" attributes, and 
store the ownership _and_ the permissions in .gitacls.

Yes, _maybe_ it is something other people might want, too, but let's start 
quick & easy, no?

Ciao,
Dscho


* Re: tracking perms/ownership [was: empty directories]
  2007-08-24  9:38         ` tracking perms/ownership [was: empty directories] Johannes Schindelin
@ 2007-08-24  9:52           ` Jeff King
  2007-08-24 15:50             ` Josh England
  2007-08-24 10:05           ` tracking perms/ownership [was: empty directories] Jeff King
  1 sibling, 1 reply; 41+ messages in thread
From: Jeff King @ 2007-08-24  9:52 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Josh England, Linus Torvalds, git

On Fri, Aug 24, 2007 at 11:38:13AM +0200, Johannes Schindelin wrote:

> > So here's the initial thought.  Create two new gitattributes, 'perms'
> > and 'ownership', which will track perms/ownership for files matching the
> > given pattern.
> 
> I wonder why you do not just use the "smudge" and "clean" attributes, and 
> store the ownership _and_ the permissions in .gitacls.
> 
> Yes, _maybe_ it is something other people might want, too, but let's start 
> quick & easy, no?

Yes, I think that is a much better idea. Perhaps they aren't that
popular among this crowd, but it seems silly to develop in this
direction and not at least consider people storing actual ACLs (or even
other extended attributes). An already-standard format like that
produced by 'getfacl' should make this pretty trivial (and handles
regular unix permissions at the same time).

-Peff


* Re: tracking perms/ownership [was: empty directories]
  2007-08-24  9:38         ` tracking perms/ownership [was: empty directories] Johannes Schindelin
  2007-08-24  9:52           ` Jeff King
@ 2007-08-24 10:05           ` Jeff King
  2007-08-25 14:30             ` Johannes Schindelin
  1 sibling, 1 reply; 41+ messages in thread
From: Jeff King @ 2007-08-24 10:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Josh England, Linus Torvalds, git

On Fri, Aug 24, 2007 at 11:38:13AM +0200, Johannes Schindelin wrote:

> I wonder why you do not just use the "smudge" and "clean" attributes, and 
> store the ownership _and_ the permissions in .gitacls.

Thinking about this more, are you proposing:

1. Clean and smudge every file, looking up the attributes in .gitacls.
In that case, I think they are not sufficient because the filter script
receives only the blob content on stdin, but never sees the filename.

or

2. Clean and smudge .gitacls, munging the file permissions as a side
effect. In this case, won't some git operations that write the files
break your permissions (i.e., if I update "foo" but not .gitacls, then
the .gitacls filter won't be run and I will be left with git's default
permissions).

-Peff


* Re: tracking perms/ownership [was: empty directories]
  2007-08-24  9:52           ` Jeff King
@ 2007-08-24 15:50             ` Josh England
  2007-08-24 20:58               ` Jeff King
  0 siblings, 1 reply; 41+ messages in thread
From: Josh England @ 2007-08-24 15:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, Linus Torvalds, git

On Fri, 2007-08-24 at 05:52 -0400, Jeff King wrote:
> On Fri, Aug 24, 2007 at 11:38:13AM +0200, Johannes Schindelin wrote:
> 
> > > So here's the initial thought.  Create two new gitattributes, 'perms'
> > > and 'ownership', which will track perms/ownership for files matching the
> > > given pattern.
> > 
> > I wonder why you do not just use the "smudge" and "clean" attributes, and 
> > store the ownership _and_ the permissions in .gitacls.
> > 
> > Yes, _maybe_ it is something other people might want, too, but let's start 
> > quick & easy, no?
> 
> Yes, I think that is a much better idea. Perhaps they aren't that
> popular among this crowd, but it seems silly to develop in this
> direction and not at least consider people storing actual ACLs (or even
> other extended attributes). An already-standard format like that
> produced by 'getfacl' should make this pretty trivial (and handles
> regular unix permissions at the same time).

Do you mean using acls through contrib/hooks/update-paranoid?  That is
the only place I see any mention of them.  clean and smudge seem out
because they are passed blob objects and have no notion of pathname.  I
don't see how to use this for automatic storing/restoring of
perms/ownership.

-JE


* Re: tracking perms/ownership
  2007-08-23 22:08         ` tracking perms/ownership Junio C Hamano
  2007-08-23 23:30           ` Linus Torvalds
  2007-08-24  7:22           ` Josh England
@ 2007-08-24 16:11           ` Josh England
  2007-08-24 16:27             ` Josh England
  2 siblings, 1 reply; 41+ messages in thread
From: Josh England @ 2007-08-24 16:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Thu, 2007-08-23 at 15:08 -0700, Junio C Hamano wrote:
> Of course, you would need to worry about resolving merge
> conflicts of gitattributes file, too.

I'm still confused on things.  So, with all perms/ownership stored
as .gitattributes, would mucking around with the index still be
necessary?  I'm not too sure what to do about merge conflicts.

-JE


* Re: tracking perms/ownership
  2007-08-24 16:11           ` Josh England
@ 2007-08-24 16:27             ` Josh England
  0 siblings, 0 replies; 41+ messages in thread
From: Josh England @ 2007-08-24 16:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Fri, 2007-08-24 at 10:11 -0600, Josh England wrote:
> On Thu, 2007-08-23 at 15:08 -0700, Junio C Hamano wrote:
> > Of course, you would need to worry about resolving merge
> > conflicts of gitattributes file, too.
> 
> I'm still confused on things.  So, with all perms/ownership stored
> as .gitattributes, would mucking around with the index still be
> necessary?  I'm not too sure what to do about merge conflicts.

OK, let me know if this is completely off-base.  perms/ownership can be
stored in the index at update-index time and restored maybe at
checkout-index time.  Calls to write-tree and read-tree can
store/retrieve the perms/ownership data from a .gitattributes file
somewhere; and something sane needs to be done about merging.  Does this
sound reasonable enough for a first cut?

-JE


* RE: empty directories
  2007-08-22 23:25     ` Linus Torvalds
                         ` (2 preceding siblings ...)
  2007-08-23 21:51       ` tracking perms/ownership [was: empty directories] Josh England
@ 2007-08-24 17:10       ` Jason Garber
  3 siblings, 0 replies; 41+ messages in thread
From: Jason Garber @ 2007-08-24 17:10 UTC (permalink / raw)
  To: git

> But if .gitattributes would work, you probably could introduce both full
> permissions and ownership rules there. We read git attributes for *other*
> reasons when checking files out _anyway_, ie we need the CRLF attribute
> stuff, so adding ownership attributes would not be at all odd.
>
> 		Linus

And as a side-note, it would be quite trivial to write a script to
initially populate a .gitattributes file cleanly (and regen when
needed).
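
Such a script might look roughly like the sketch below.  It assumes a
"path mode=... uid=... gid=..." line format, which is only an
illustration and not something git understands; metadata_lines is a
made-up name.

```python
import os
import stat


def metadata_lines(root):
    """Yield one 'path mode=... uid=... gid=...' line per file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the repository's own metadata and keep the order stable.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            rel = os.path.relpath(full, root)
            yield "%s mode=%04o uid=%d gid=%d" % (
                rel, stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
```

Regenerating is then a matter of writing the joined lines back to the
file and letting "git diff" show what changed.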

~ JasonG


* Re: tracking perms/ownership
  2007-08-24  7:50                 ` David Kastrup
@ 2007-08-24 17:51                   ` Linus Torvalds
  2007-08-24 18:15                     ` Josh England
  2007-08-24 21:30                     ` David Kastrup
  0 siblings, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2007-08-24 17:51 UTC (permalink / raw)
  To: David Kastrup; +Cc: git



On Fri, 24 Aug 2007, David Kastrup wrote:
> >
> > Which means that any such environment *has* to encode the ownership
> > *separately* from the actual filesystem ownership. Because doing it
> > in the filesystem simply isn't sane.
> 
> But in this case you have a work directory and an installation
> directory.  And you have an installation procedure.  No tracking is
> involved at all.

I agree that the cases are different.

I also agree that a tool that is *specialized* to only do basically 
backups (or, equivalently, "distributed installation") would potentially 
be a different issue, and there "it will only run as root" is a reasonable 
thing to do.

But git is, if anything, specialized the other way - which means that I 
think it's perfectly fine to let it know about ownership, but it's *not* a 
valid thing to do to then say "only root can do it". 

Also, even with a distributed installer/backup thing, the fact is, 
"ownership" and "permissions" is simply not well-defined at a filesystem 
level. Are we talking just unix owner/group/mode here? That won't do for a 
lot of filesystems that have ACL's or other extended user/permission 
information. 

> In your example, neither installed files nor ownership are tracked in
> the filesystem.  Both are "tracked" in the Makefile.  Or rather than
> being tracked, they are explicitly catered for by the user.

And I seriously am saying that that is the only way to handle things 
sanely in a distributed content tracker like git.

Because full permissions and ownership (think ACL's) simply aren't 
"content" enough. The way to _reliably_ turn them into "content" that can 
be tracked, is to make it some form of file content.

Because otherwise, you will always hit situations where you simply cannot 
access it sanely. Even as an administrator you might need to do some 
emergency fixup, but you may be on vacation, and the only thing you have 
access to is some machine that you're not root on - and you'd like to send 
a "git bundle" with the fix to your less-than-stellar stand-in that is 
knee-deep in sh*t because he doesn't know the system, and you're on some 
sunny tropical island.

Or just imagine the case where you have slightly different setups for 
different people - some have ACL's, some have just basic permissions. But 
you want to maintain an image that works for both cases. What do you do?

See? If you just accept the fact that ownership and permissions are 
totally "separate content" that is tracked AS CONTENT, and not as the 
filesystem thing, you solve all these problems.

> git is a content _tracker_.  It tracks contents, also contents that
> move around.  If it can't track the permissions moving around as well,
> it's sort of pointless to integrate this into git: if you have to
> manage the stuff yourself, anyway, there is no point in creating the
> illusion that it is done by git.

Fair enough - I'll certainly agree with the notion that we don't 
necessarily need any integration of permissions/ownership into git at 
all, and you can always do it as a totally independent layer.

> > Your choice. But I know which one I'd choose.
> 
> That's fine.  But you don't actually need git at all to implement your
> choice, so this is orthogonal to whether having an option to do it
> inside of git might be worth having.

But I care about git having a *sane*design*, whether I use all the 
features or not. Because I simply care about my tools at a higher level 
than most users do. Which means that it doesn't matter whether I'll use 
permissions/ownership tracking or not - I still require that git do it 
*sanely* from my standpoint of having a good content tracker.

And that means tracking those things *separately*, and not trying to mess 
up the "tree" structure, for example.

			Linus


* Re: tracking perms/ownership
  2007-08-24 17:51                   ` Linus Torvalds
@ 2007-08-24 18:15                     ` Josh England
  2007-08-24 18:23                       ` Linus Torvalds
  2007-08-24 19:33                       ` Robin Rosenberg
  2007-08-24 21:30                     ` David Kastrup
  1 sibling, 2 replies; 41+ messages in thread
From: Josh England @ 2007-08-24 18:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Kastrup, git

On Fri, 2007-08-24 at 10:51 -0700, Linus Torvalds wrote:
> Because full permissions and ownership (think ACL's) simply aren't 
> "content" enough. The way to _reliably_ turn them into "content" that can 
> be tracked, is to make it some form of file content.
> 
> Because otherwise, you will always hit situations where you simply cannot 
> access it sanely. Even as an administrator you might need to do some 
> emergency fixup, but you may be on vacation, and the only thing you have 
> access to is some machine that you're not root on - and you'd like to send 
> a "git bundle" with the fix to your less-than-stellar stand-in that is 
> knee-deep in sh*t because he doesn't know the system, and you're on some 
> sunny tropical island.

Using the .gitattributes approach essentially does turn perms/ownership
into trackable content.  A non-root user could specify the ownership of
certain files just by editing the .gitattributes, much in the same way a
non-root user can create an initramfs filesystem.

> Or just imagine the case where you have slightly different setups for 
> different people - some have ACL's, some have just basic permissions. But 
> you want to maintain an image that works for both cases. What do you do?

punt  :)   Simple unix ownership and perms are a good first cut.  ACL's
could probably be handled in much the same way, but converting between
unix perms and ACLs might have to be a separate attribute/filter
entirely.

> See? If you just accept the fact that ownership and permissions are 
> totally "separate content" that is tracked AS CONTENT, and not as the 
> filesystem thing, you solve all these problems.
> 
> > git is a content _tracker_.  It tracks contents, also contents that
> > move around.  If it can't track the permissions moving around as well,
> > it's sort of pointless to integrate this into git: if you have to
> > manage the stuff yourself, anyway, there is no point in creating the
> > illusion that it is done by git.
> 
> Fair enough - I'll certainly agree with the notion that we don't 
> necessarily need any integration of permissions/ownership into git at 
> all, and you can always do it as a totally independent layer.
> 
> > > Your choice. But I know which one I'd choose.
> > 
> > That's fine.  But you don't actually need git at all to implement your
> > choice, so this is orthogonal to whether having an option to do it
> > inside of git might be worth having.
> 
> But I care about git having a *sane*design*, whether I use all the 
> features or not. Because I simply care about my tools at a higher level 
> than most users do. Which means that it doesn't matter whether I'll use 
> permissions/ownership tracking or not - I still require that git do it 
> *sanely* from my standpoint of having a good content tracker.
> 
> And that means tracking those things *separately*, and not trying to mess 
> up the "tree" structure, for example.

Do you think it's OK to cache this stuff in the index, though?
write-tree could then just dump the perms/ownership out as gitattributes
somewhere.

-JE


* Re: tracking perms/ownership
  2007-08-24 18:15                     ` Josh England
@ 2007-08-24 18:23                       ` Linus Torvalds
  2007-08-24 18:56                         ` Josh England
  2007-08-24 19:33                       ` Robin Rosenberg
  1 sibling, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2007-08-24 18:23 UTC (permalink / raw)
  To: Josh England; +Cc: David Kastrup, git



On Fri, 24 Aug 2007, Josh England wrote:
> 
> Do you think it's OK to cache this stuff in the index, though?
> write-tree could then just dump the perms/ownership out as gitattributes
> somewhere.

I'd really prefer not.

The index state - very much by design - matches the filesystem "stat" 
data, not the internal git data. So "ce_size" matches the checked-out 
size, not the native git data size (ie with CRLF conversion, it matches 
not the checked-in data, but the filesystem version). 

And the same really goes for ce_uid/ce_gid: they have to match what's on 
the filesystem, because they are used not to track user information, but 
to verify that the inode data is valid!

Yeah, we could just ignore them for checking "is the inode the same", but 
that would actually end up *defeating* the point of what you want to do: 
at that point, we'd also obviously ignore it when ownership changes!
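
To illustrate the "is the inode the same" check, here is a rough sketch
of the idea, not git's code: remember a few lstat() fields and treat any
difference as "this file may have changed".  The field list is an
assumption chosen for the example; the real index entry holds more than
is shown here.

```python
import os

# Fields remembered per file; illustrative only.
FIELDS = ("st_mtime_ns", "st_size", "st_ino", "st_uid", "st_gid")


def snapshot(path):
    """Return the cached-stat tuple for path, without following symlinks."""
    st = os.lstat(path)
    return tuple(getattr(st, field) for field in FIELDS)


def looks_unchanged(path, cached):
    """True when the file's current stat data still matches the snapshot."""
    return snapshot(path) == cached
```

Note that uid/gid take part purely as a validity check here.  If the
cached values were replaced by the *desired* owner, the comparison would
no longer say anything about what is actually on disk.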

			Linus


* Re: tracking perms/ownership
  2007-08-24 18:23                       ` Linus Torvalds
@ 2007-08-24 18:56                         ` Josh England
  2007-08-24 20:37                           ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Josh England @ 2007-08-24 18:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David Kastrup, git

On Fri, 2007-08-24 at 11:23 -0700, Linus Torvalds wrote:
> 
> On Fri, 24 Aug 2007, Josh England wrote:
> > 
> > Do you think it's OK to cache this stuff in the index, though?
> > write-tree could then just dump the perms/ownership out as gitattributes
> > somewhere.
> 
> I'd really prefer not.
> 
> The index state - very much by design - matches the filesystem "stat" 
> data, not the internal git data. So "ce_size" matches the checked-out 
> size, not the native git data size (ie with CRLF conversion, it matches 
> not the checked-in data, but the filesystem version). 

That's exactly what I'm after, too: having a snapshot of all lstat data
in the index, because I don't want to have to do an extra stat
somewhere.

> And the same really goes for ce_uid/ce_gid: they have to match what's on 
> the filesystem, because they are used not to track user information, but 
> to verify that the inode data is valid!

But the stat data (even uid/gid) is in there nonetheless, right?  If
everything is in there already I wouldn't need to add a thing.  I just
want to access the index cache rather than hitting the filesystem
directly.

> Yeah, we could just ignore them for checking "is the inode the same", but 
> that would actually end up *defeating* the point of what you want to do: 
> at that point, we'd also obviously ignore it when ownership changes!

If we view the index as being a snapshot of the filesystem, and if
perm/ownership data is stored as .gitattributes in the actual repo, 
then the perm/ownership engine just has to reconcile between the index
and the .gitattributes file for both the read and write case.
Differences in the write case would result in the .gitattributes being
updated.  Differences in the read case would result in chown/chmod being
run in the working tree.  Does this make sense?
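
For the read case, the reconcile step could be sketched as below.  This
assumes both sides have already been reduced to path -> (mode, uid, gid)
maps; plan_fixups is a hypothetical name, and nothing here touches the
index or runs chown/chmod itself.

```python
def plan_fixups(recorded, actual):
    """Compare recorded (mode, uid, gid) tuples with what lstat reported.

    Both arguments map path -> (mode, uid, gid).  Returns a list of
    ("chmod", path, mode) and ("chown", path, uid, gid) actions.
    """
    actions = []
    for path in sorted(recorded):
        if path not in actual:
            continue  # file is not checked out; nothing to fix
        want_mode, want_uid, want_gid = recorded[path]
        have_mode, have_uid, have_gid = actual[path]
        if want_mode != have_mode:
            actions.append(("chmod", path, want_mode))
        if (want_uid, want_gid) != (have_uid, have_gid):
            actions.append(("chown", path, want_uid, want_gid))
    return actions
```

A non-root user could print the "chown" actions as warnings instead of
executing them, which would cover the extra verbosity asked for earlier
in the thread.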

-JE


* Re: tracking perms/ownership
  2007-08-24 18:15                     ` Josh England
  2007-08-24 18:23                       ` Linus Torvalds
@ 2007-08-24 19:33                       ` Robin Rosenberg
  1 sibling, 0 replies; 41+ messages in thread
From: Robin Rosenberg @ 2007-08-24 19:33 UTC (permalink / raw)
  To: Josh England; +Cc: Linus Torvalds, David Kastrup, git

On Friday 24 August 2007, Josh England wrote:
> punt  :)   Simple unix ownership and perms are a good first cut.  ACL's
> could probably be handled in much the same way, but converting between
> unix perms and ACLs might have to be a separate attribute/filter
> entirely.

You cannot convert between traditional unix permissions and ACLs. Either you
manage ACLs or not. The traditional form is fortunately just a special case of POSIX
ACLs. Here is a getfacl example. Just feed it to setfacl to set permissions according
to the dump.

$ getfacl README
# file: README
# owner: me
# group: me
user::rw-
group::r--
group:apache:rwx
mask::rwx
other::r--

Windows ACLs are different, though.
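
If a tool needed to read such a dump rather than just hand it to setfacl,
a minimal parser could look like the sketch below.  It assumes the short
text form shown above (header comments, then tag:qualifier:perms lines)
and does not handle the "#effective:" annotations that getfacl can append
to entries.

```python
def parse_getfacl(text):
    """Split a getfacl dump into (header dict, list of ACL entries)."""
    header, entries = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# "):
            # "# owner: me" -> header["owner"] = "me"
            key, _, value = line[2:].partition(": ")
            header[key] = value
        else:
            # "group:apache:rwx" -> ("group", "apache", "rwx")
            tag, qualifier, perms = line.split(":")
            entries.append((tag, qualifier, perms))
    return header, entries
```

Entries with an empty qualifier (user::, group::, other::) are the ones
that correspond to the traditional owner/group/other permission bits.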

-- robin


* Re: tracking perms/ownership
  2007-08-24 18:56                         ` Josh England
@ 2007-08-24 20:37                           ` Junio C Hamano
  2007-08-24 21:26                             ` Josh England
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2007-08-24 20:37 UTC (permalink / raw)
  To: Josh England; +Cc: Linus Torvalds, David Kastrup, git

"Josh England" <jjengla@sandia.gov> writes:

> But the stat data (even uid/gid) is in there nonetheless, right?  If
> everything is in there already I wouldn't need to add a thing.  I just
> want to access the index cache rather than hitting the filesystem
> directly.

But to use that data you would need extra code to move things
from there to gitattributes, wouldn't you?  I can see that you
could "stage" a change of ownership in the index and attempt to
commit it with the (nonexistent)

	git update-index --chown root foo.c

which would say "foo.c is now owned by uid #0", but before the
next git-commit-tree runs, somebody (namely, "git-commit") has
to run a possibly enhanced "git diff-files" (traditionally
uid/gid are NOT part of contents at all, so diff-files would not
report that ownership differs between the filesystem and the index,
let alone how) to notice that ownership has changed, and
update .gitattributes.

Then you need to also "git update-index" the .gitattributes as
well, to record the ownership change in the commit.  What if the
user had unrelated changes that the user does not want to commit
in .gitattributes?

It will quickly become a mess.

It would rather be more effective for the user action "I want to
change the ownership of foo.c to root" to cause a direct
manipulation of .gitattributes file.  For this, we can add a
nice wrapper if there is a need, but the initial cut could be
just running "${EDITOR-${VISUAL-vi}} .gitattributes", nothing
more.

The user can say "git diff" to view .gitattributes changes, and
if that is what he wants (maybe he wants to do "git add -i" to
pick only the hunk about the ownership change for the next
commit), the change to .gitattributes can be committed.


* Re: tracking perms/ownership [was: empty directories]
  2007-08-24 15:50             ` Josh England
@ 2007-08-24 20:58               ` Jeff King
  2007-08-25 14:31                 ` Johannes Schindelin
  0 siblings, 1 reply; 41+ messages in thread
From: Jeff King @ 2007-08-24 20:58 UTC (permalink / raw)
  To: Josh England; +Cc: Johannes Schindelin, Linus Torvalds, git

On Fri, Aug 24, 2007 at 09:50:32AM -0600, Josh England wrote:

> > direction and not at least consider people storing actual ACLs (or even
> > other extended attributes). An already-standard format like that
> 
> Do you mean using acls through contrib/hooks/update-paranoid?  That is
> the only place I see any mention of them.  clean and smudge seem out
> because they are passed blob objects and have no notion of pathname.  I
> don't see how to use this for automatic storing/restoring of
> perms/ownership.

No, I mean filesystem ACLs. Your complaint is that git stores only the
file _content_, not some specific metadata that you want (owner, group,
permissions). My point is that there is _other_ metadata, too (such as
POSIX ACLs) that could be stored. Even if you don't want to store them,
if you are extending git's capabilities, it makes sense to at least
consider how to handle those cases, too.

But yes, clean and smudge don't get the pathname. It would be a fairly
trivial patch, though, so maybe I'll play with it.

-Peff


* Re: tracking perms/ownership
  2007-08-24 20:37                           ` Junio C Hamano
@ 2007-08-24 21:26                             ` Josh England
  0 siblings, 0 replies; 41+ messages in thread
From: Josh England @ 2007-08-24 21:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, David Kastrup, git

On Fri, 2007-08-24 at 13:37 -0700, Junio C Hamano wrote:
> It would rather be more effective for the user action "I want to
> change the ownership of foo.c to root" to cause a direct
> manipulation of .gitattributes file.  For this, we can add a
> nice wrapper if there is a need, but the initial cut could be
> just running "${EDITOR-${VISUAL-vi}} .gitattributes", nothing
> more.
> 
> The user can say "git diff" to view .gitattributes changes, and
> if that is what he wants (maybe he wants to do "git add -i" to
> pick only the hunk about the ownership change for the next
> commit), the change to .gitattributes can be committed.

That sounds fine, and is certainly easier to implement.  The only catch
is that whatever wrapper is updating .gitattributes will have to walk
the working tree doing lstat() calls, which seems redundant (and costly)
to me.

-JE


* Re: tracking perms/ownership
  2007-08-24 17:51                   ` Linus Torvalds
  2007-08-24 18:15                     ` Josh England
@ 2007-08-24 21:30                     ` David Kastrup
  1 sibling, 0 replies; 41+ messages in thread
From: David Kastrup @ 2007-08-24 21:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Fri, 24 Aug 2007, David Kastrup wrote:
>
>> In your example, neither installed files nor ownership are tracked
>> in the filesystem.  Both are "tracked" in the Makefile.  Or rather
>> than being tracked, they are explicitly catered for by the user.
>
> And I seriously am saying that that is the only way to handle things
> sanely in a distributed content tracker like git.

Well, maybe _if_ you are using it as a distributed content tracker.
But git is excellent at tracking contents (and resolving conflicts and
merges) even if you _don't_ distribute.

Anyway, my beef with using something like .gitattributes or similar
for tracking permissions is twofold:

a) if I am tracking a directory, having to track additional files
clutters the directory.  So if one uses a separate file for tracking,
it should be able to use a file that is not actually in the work tree.
But it still needs to be versioned.  One could possibly fudge this by
creating an artificial work tree with the tracked directory being in a
subdirectory of it, but that's all pretty dorky.

b) merge resolution and movement tracking.  Delegating stuff to a file
and using the _file_ merging and tracking mechanisms is just not
really the same thing.  So it would be nice to at least have
"pluggable merge strategies" for particular files, or treat
gitattributes special with regard to merging, anyway.

Personally, I'm leaning towards a pluggable policy system containing
rules how permission information is represented textually in the
repository (that would allow acls and uid gid information), how the
index is updated from repo and workdir and vice versa.  The default
policy would just talk about 777 or 666 (or 775 and 644) as it does
now.

We already have a policy flag that optionally blocks the information
flow from/to the index regarding executable bits.  So it is not like
the concept is alien.

On the matter of taste: I feel fine about storing numerical uid/gid
data in the index, but I am already getting queasy with the idea of
storing them numerically in the repository: that's a place where I
find symbolic names more appropriate.
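
To make the "symbolic names in the repository" idea concrete, here is a
minimal Python sketch of one possible textual representation.  The line
format (octal mode, owner:group, path) and the names format_entry and
describe are assumptions made for illustration; nothing like this exists
in git, and the thread does not settle on a format.

```python
import grp
import os
import pwd
import stat


def format_entry(mode, owner, group, path):
    """Render one manifest line: octal mode, symbolic owner:group, path."""
    return f"{mode:04o} {owner}:{group} {path}"


def describe(path):
    """Build a manifest line for a file on disk (Unix only).

    Numeric ids are translated to names; if an id has no passwd/group
    entry, the number itself is used as a fallback.
    """
    st = os.lstat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, no file type
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return format_entry(mode, owner, group, path)
```

A pluggable policy in the sense above would decide whether such lines
are produced at all, and how they are mapped back to numeric ids on
checkout.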

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: tracking perms/ownership [was: empty directories]
  2007-08-24 10:05           ` tracking perms/ownership [was: empty directories] Jeff King
@ 2007-08-25 14:30             ` Johannes Schindelin
  0 siblings, 0 replies; 41+ messages in thread
From: Johannes Schindelin @ 2007-08-25 14:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Josh England, Linus Torvalds, git

Hi,

On Fri, 24 Aug 2007, Jeff King wrote:

> On Fri, Aug 24, 2007 at 11:38:13AM +0200, Johannes Schindelin wrote:
> 
> > I wonder why you do not just use the "smudge" and "clean" attributes, and 
> > store the ownership _and_ the permissions in .gitacls.
> 
> Thinking about this more, are you proposing:
> 
> 1. Clean and smudge every file, looking up the attributes in .gitacls.

Yes, this is what I thought about.  And I'd just have looked up the file 
name by sha1.

The clean/smudge filter would update .gitacls in the index, too...
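
A rough Python sketch of the "look up by sha1" idea follows.  The blob id
computation (SHA-1 over "blob <size>\0" plus the content) is how git
names blobs; the .gitacls line format (sha1, octal mode, owner) and the
function names are purely hypothetical, since the thread never defines
them.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the git blob object name for the given file content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def parse_gitacls(text: str) -> dict:
    """Parse hypothetical '<sha1> <octal mode> <owner>' lines into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sha, mode, owner = line.split()
        table[sha] = (int(mode, 8), owner)
    return table


def lookup(table: dict, content: bytes):
    """Return (mode, owner) recorded for this content, or None."""
    return table.get(blob_id(content))
```

One consequence of keying on the sha1 alone: two files with identical
content share a blob id, so they could not carry different permissions
under this scheme.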

Ciao,
Dscho

* Re: tracking perms/ownership [was: empty directories]
  2007-08-24 20:58               ` Jeff King
@ 2007-08-25 14:31                 ` Johannes Schindelin
  2007-08-25 14:46                   ` tracking perms/ownership Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Schindelin @ 2007-08-25 14:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Josh England, Linus Torvalds, git

Hi,

On Fri, 24 Aug 2007, Jeff King wrote:

> On Fri, Aug 24, 2007 at 09:50:32AM -0600, Josh England wrote:
> 
> > > direction and not at least consider people storing actual ACLs (or even
> > > other extended attributes). An already-standard format like that
> > 
> > Do you mean using acls through contrib/hooks/update-paranoid?  That is
> > the only place I see any mention of them.  clean and smudge seem out
> > because they are passed blob objects and have no notion of pathname.  I
> > don't see how to use this for automatic storing/restoring of
> > perms/ownership.
> 
> No, I mean filesystem ACLs. Your complaint is that git stores only the
> file _content_, not some specific metadata that you want (owner, group,
> permissions). My point is that there is _other_ metadata, too (such as
> POSIX ACLs) that could be stored. Even if you don't want to store them,
> if you are extending git's capabilities, it makes sense to at least
> consider how to handle those cases, too.
> 
> But yes, clean and smudge don't get the pathname. It would be a fairly
> trivial patch, though, so maybe I'll play with it.

Yes, please do.  Even if you do not end up implementing the perms/owner 
tracking using the clean/smudge filter, it seems odd that the filter 
should not get the filename.

Ciao,
Dscho

* Re: tracking perms/ownership
  2007-08-25 14:31                 ` Johannes Schindelin
@ 2007-08-25 14:46                   ` Junio C Hamano
  2007-08-25 19:35                     ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2007-08-25 14:46 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Josh England, Linus Torvalds, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Yes, please do.  Even if you do not end up implementing the perms/owner 
> tracking using the clean/smudge filter, it seems odd that the filter 
> should not get the filename.

Please don't.  Go back to the list discussion and recall why any
filters that depends on nothing but contents are bad ("crlf good,
keyword bad").  Don't feed paths to filters.

* Re: tracking perms/ownership
  2007-08-25 14:46                   ` tracking perms/ownership Junio C Hamano
@ 2007-08-25 19:35                     ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2007-08-25 19:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Josh England, Linus Torvalds, git

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> Yes, please do.  Even if you do not end up implementing the perms/owner 
>> tracking using the clean/smudge filter, it seems odd that the filter 
>> should not get the filename.
>
> Please don't.  Go back to the list discussion and recall why any
> filters that depends on nothing but contents are bad ("crlf good,
> keyword bad").  Don't feed paths to filters.

Gaaaaah.  I meant "a filter whose action depends on anything
other than contents".  Those other things include history (so
"commit ID" is out) and pathnames.
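
The distinction can be illustrated with a small Python sketch.  Both
functions are made up for this illustration and are not git's filter
interface: the first depends only on the bytes it is given, while the
second also depends on the pathname.

```python
def clean_crlf(data: bytes) -> bytes:
    """Content-only filter: normalise CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def clean_with_path(data: bytes, path: str) -> bytes:
    """Hypothetical path-dependent filter: stamps the pathname into the content."""
    return b"# path: " + path.encode() + b"\n" + data


same = b"line one\r\nline two\r\n"

# The content-only filter gives the same result wherever the file lives.
assert clean_crlf(same) == b"line one\nline two\n"

# The path-dependent filter gives different results for identical content,
# so the stored bytes would change when the file is renamed or copied.
assert clean_with_path(same, "a.txt") != clean_with_path(same, "b.txt")
```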

Thread overview: 41+ messages
2007-08-21 17:14 empty directories Josh England
2007-08-21 17:40 ` Sean
2007-08-22 21:25   ` Josh England
2007-08-22 23:25     ` Linus Torvalds
2007-08-22 23:55       ` David Kastrup
2007-08-23 15:24       ` Josh England
2007-08-23 21:51       ` tracking perms/ownership [was: empty directories] Josh England
2007-08-23 22:08         ` tracking perms/ownership Junio C Hamano
2007-08-23 23:30           ` Linus Torvalds
2007-08-24  6:16             ` David Kastrup
2007-08-24  6:37               ` Linus Torvalds
2007-08-24  7:38                 ` Josh England
2007-08-24  7:50                 ` David Kastrup
2007-08-24 17:51                   ` Linus Torvalds
2007-08-24 18:15                     ` Josh England
2007-08-24 18:23                       ` Linus Torvalds
2007-08-24 18:56                         ` Josh England
2007-08-24 20:37                           ` Junio C Hamano
2007-08-24 21:26                             ` Josh England
2007-08-24 19:33                       ` Robin Rosenberg
2007-08-24 21:30                     ` David Kastrup
2007-08-24  7:22           ` Josh England
2007-08-24  7:39             ` Junio C Hamano
2007-08-24  8:19               ` Josh England
2007-08-24 16:11           ` Josh England
2007-08-24 16:27             ` Josh England
2007-08-24  9:38         ` tracking perms/ownership [was: empty directories] Johannes Schindelin
2007-08-24  9:52           ` Jeff King
2007-08-24 15:50             ` Josh England
2007-08-24 20:58               ` Jeff King
2007-08-25 14:31                 ` Johannes Schindelin
2007-08-25 14:46                   ` tracking perms/ownership Junio C Hamano
2007-08-25 19:35                     ` Junio C Hamano
2007-08-24 10:05           ` tracking perms/ownership [was: empty directories] Jeff King
2007-08-25 14:30             ` Johannes Schindelin
2007-08-24 17:10       ` empty directories Jason Garber
2007-08-22 23:40     ` Jakub Narebski
2007-08-22  0:06 ` Jakub Narebski
2007-08-22  4:31 ` Salikh Zakirov
2007-08-22 18:46   ` Linus Torvalds
2007-08-22 19:12     ` David Kastrup
