git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Destructive side-effect of "cg-status"
@ 2005-09-30 16:03 Wolfgang Denk
  2005-10-01 10:24 ` Martin Langhoff
  2005-10-01 16:41 ` Linus Torvalds
  0 siblings, 2 replies; 18+ messages in thread
From: Wolfgang Denk @ 2005-09-30 16:03 UTC (permalink / raw)
  To: git

So far I  thought  "cg-status"  is  a  harmless  command  which  just
displays  some  status information. It ain't so. One of our engineers
reported a  corrupted  repository  after  I  ran  "cg-status"  in  his
directory:

$ cg-status
Heads:
   >master      805f93e4ca96d0c0cb2d2f9532d9666b22961e88
  R origin      805f93e4ca96d0c0cb2d2f9532d9666b22961e88

error: open failed
fatal: cache corrupted
error: open failed
? COPYING
? CREDITS
? Documentation/00-INDEX
? Documentation/BUG-HUNTING
...
error: open failed
read_cache: Permission denied
...
error: open failed
read_cache: Permission denied
...


As mentioned before,  all  I  did  was  running  "cg-status"  in  his
directory. Here is what happens:

Before:

	-> rpm -q cogito
	cogito-0.15.1-1
	-> id
	uid=500(wd) gid=500(wd) groups=200(gitmaster),400(denx),500(wd)
	-> umask
	0002
	-> ls -ld .git
	drwxrwxrwx  6 sr sr 80 Sep 30 17:49 .git
	-> ls -l .git/index
	-rw-r--r--  1 sr sr 1728032 Sep 30 17:17 .git/index

Then:

	-> cg-status
	Heads:
	   >master      805f93e4ca96d0c0cb2d2f9532d9666b22961e88
	  R origin      805f93e4ca96d0c0cb2d2f9532d9666b22961e88

	M arch/ppc/configs/bubinga_defconfig
	M arch/ppc/configs/walnut_defconfig
	-> ls -l .git/index
	-rw-------  1 wd wd 1728032 Sep 30 17:49 .git/index
	^^^^^^^^^^    ^^^^^

That means, that "cg-status" actually *rewrote* .git/index,  with  me
(wd)  as  new  owner, and - ignoring my umask - with permissions that
prevent the original owner (sr) to access the file!

Arghhhh!!!

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
Generally speaking, there are other ways to accomplish whatever it is
that you think you need ...                               - Doug Gwyn

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Destructive side-effect of "cg-status"
  2005-09-30 16:03 Destructive side-effect of "cg-status" Wolfgang Denk
@ 2005-10-01 10:24 ` Martin Langhoff
  2005-10-01 16:41 ` Linus Torvalds
  1 sibling, 0 replies; 18+ messages in thread
From: Martin Langhoff @ 2005-10-01 10:24 UTC (permalink / raw)
  To: Wolfgang Denk; +Cc: git

On 10/1/05, Wolfgang Denk <wd@denx.de> wrote:
> So far I  thought  "cg-status"  is  a  harmless  command  which  just
> displays  some  status information. It ain't so. One of our engineers
> reported a  corrupted  repository  after  I  ran  "cg-status"  in  his
> directory:

Interesting... cg-status, and sometimes cg-diff, have to update the
index. The index is the trick behind git's performance (and some other
smarts). It's never really touched by Cogito, but by the git-*-index
commands.

Perhaps git-*-index commands should check ownership vs current uid and
print a warning?

cheers,


martin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Destructive side-effect of "cg-status"
  2005-09-30 16:03 Destructive side-effect of "cg-status" Wolfgang Denk
  2005-10-01 10:24 ` Martin Langhoff
@ 2005-10-01 16:41 ` Linus Torvalds
  2005-10-01 18:14   ` Junio C Hamano
  2005-10-01 19:42   ` Destructive side-effect of "cg-status" Wolfgang Denk
  1 sibling, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2005-10-01 16:41 UTC (permalink / raw)
  To: Wolfgang Denk; +Cc: Git Mailing List, Junio C Hamano



On Fri, 30 Sep 2005, Wolfgang Denk wrote:
>
> So far I  thought  "cg-status"  is  a  harmless  command  which  just
> displays  some  status information. It ain't so. One of our engineers
> reported a  corrupted  repository  after  I  ran  "cg-status"  in  his
> directory:

Well, it's not corrupted, but yes, the index file ends up unreadable.

> That means, that "cg-status" actually *rewrote* .git/index,  with  me
> (wd)  as  new  owner, and - ignoring my umask - with permissions that
> prevent the original owner (sr) to access the file!

The umask thing looks like a bug. Fixed thus.

Also, arguably we should try to avoid writing the index file when not 
necessary, although the fact is, that cg-status (and "git status") _do_ 
need to actually keep it up-to-date in order to do the right thing. Also 
true of some other programs that might otherwise appear to be read-only 
(ie I've considered doing the same thing for "git diff").

We used to have that optimization, but it was broken. I fixed it but 
disabled it for fear of other bugs.

But honoring umask would seem to be a no-brainer.

		Linus

----
diff --git a/index.c b/index.c
--- a/index.c
+++ b/index.c
@@ -29,7 +29,7 @@ int hold_index_file_for_update(struct ca
 		signal(SIGINT, remove_lock_file_on_signal);
 		atexit(remove_lock_file);
 	}
-	return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0600);
+	return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0666);
 }
 
 int commit_index_file(struct cache_file *cf)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Destructive side-effect of "cg-status"
  2005-10-01 16:41 ` Linus Torvalds
@ 2005-10-01 18:14   ` Junio C Hamano
  2005-10-01 19:07     ` Honor extractor's umask in git-tar-tree Junio C Hamano
  2005-10-01 19:42   ` Destructive side-effect of "cg-status" Wolfgang Denk
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-01 18:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Wolfgang Denk, Git Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> -	return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0600);
> +	return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0666);

Good spotting - thanks.  We tried to use 0666/0777 everywhere
and let umask do its job, but this was one of the two places we
still had 0[67]00.  I'd do the same for the other 0600 in
mailsplit.c, not that I think it matters, but just for
consistency.

Unrelated to the topic at hand, but related to the mode bits --
tar-tree generates archives with 0644/0755 permission bits.  It
might not be a bad idea to just let the tar command honor umask
of the extracter, by storing 0666 and 0777 in the archive.

I always work in an environment where umask 002 is the norm, and
get irritated when upstream tarballs of other peoples' projects
create directories with mode 0755, making me do chmod 2775 on
them.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Honor extractor's umask in git-tar-tree.
  2005-10-01 18:14   ` Junio C Hamano
@ 2005-10-01 19:07     ` Junio C Hamano
  2005-10-02  3:24       ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-01 19:07 UTC (permalink / raw)
  To: git; +Cc: Rene Scharfe, Linus Torvalds

The archive generated with git-tar-tree had 0755 and 0644 mode bits.
This inconvenienced the extractor with umask 002 by robbing g+w bit
unconditionally.  Just write it out with loose permissions bits and
let the umask of the extractor do its job.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Junio C Hamano <junkio@cox.net> writes:

> Unrelated to the topic at hand, but related to the mode bits --
> tar-tree generates archives with 0644/0755 permission bits.  It
> might not be a bad idea to just let the tar command honor umask
> of the extracter, by storing 0666 and 0777 in the archive.
>
> I always work in an environment where umask 002 is the norm, and
> get irritated when upstream tarballs of other peoples' projects
> create directories with mode 0755, making me do chmod 2775 on
> them.

diff --git a/tar-tree.c b/tar-tree.c
--- a/tar-tree.c
+++ b/tar-tree.c
@@ -353,6 +353,7 @@ static void traverse_tree(void *buffer, 
 
 		if (size < namelen + 20 || sscanf(buffer, "%o", &mode) != 1)
 			die("corrupt 'tree' file");
+		mode |= (mode & 0100) ? 0777 : 0666;
 		buffer = sha1 + 20;
 		size -= namelen + 20;
 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Destructive side-effect of "cg-status"
  2005-10-01 16:41 ` Linus Torvalds
  2005-10-01 18:14   ` Junio C Hamano
@ 2005-10-01 19:42   ` Wolfgang Denk
  2005-10-01 20:24     ` Linus Torvalds
  1 sibling, 1 reply; 18+ messages in thread
From: Wolfgang Denk @ 2005-10-01 19:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano

In message <Pine.LNX.4.64.0510010934290.3378@g5.osdl.org>
Linus Torvalds wrote:
> 
> Also, arguably we should try to avoid writing the index file when not 
> necessary, although the fact is, that cg-status (and "git status") _do_ 
> need to actually keep it up-to-date in order to do the right thing. Also 
> true of some other programs that might otherwise appear to be read-only 
> (ie I've considered doing the same thing for "git diff").

But shouldn't it be possible to run such  commands  as  "status"  and
"diff" in a repository for which I have only read permissions? Or how
can  I find out about the status of another user's repository without
actually modifying it?

Also, error reporting is IMHO  not  sufficient  and  misleading.  For
example:

	$ git status 2>&1 | less
	error: open failed
	#
	# Updated but not checked in:
	#   (will commit)
	#
	#       deleted:  CHANGELOG
	#       deleted:  COPYING
	#       deleted:  CREDITS
	#       deleted:  MAINTAINERS
	#       deleted:  MAKEALL
	#       deleted:  Makefile
	#       deleted:  README
	...
	[all files in the repository flagged as "deleted" !]
	#
	error: open failed
	read_cache: Permission denied

The "error: open failed" should at leats include the  file  name  and
the  errno/strerror  message.  Same  for  the "read_cache: Permission
denied" - of course, if you knot the git internals you will know what
this means, but the average user has no idea that he should check the
permissions of .git/index.

Finally, a thick fat warning should be  added  to  the  documentation
that  these  commands  actually (may) modify the repository. This was
totally unexpected for me.

Thanks.

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
In an infinite universe all things are possible, including the possi-
bility that the universe does not exist.
                        - Terry Pratchett, _The Dark Side of the Sun_

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Destructive side-effect of "cg-status"
  2005-10-01 19:42   ` Destructive side-effect of "cg-status" Wolfgang Denk
@ 2005-10-01 20:24     ` Linus Torvalds
  0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2005-10-01 20:24 UTC (permalink / raw)
  To: Wolfgang Denk; +Cc: Git Mailing List, Junio C Hamano



On Sat, 1 Oct 2005, Wolfgang Denk wrote:
> 
> The "error: open failed" should at leats include the  file  name  and
> the  errno/strerror  message.  Same  for  the "read_cache: Permission
> denied" - of course, if you knot the git internals you will know what
> this means, but the average user has no idea that he should check the
> permissions of .git/index.

Agreed.

Something like this would seem sane.

		Linus

---
Subject: Better error reporting for "git status"

Instead of "git status" ignoring (and hiding) potential errors from the 
"git-update-index" call, make it exit if it fails, and show the error.

In order to do this, use the "-q" flag (to ignore not-up-to-date files) 
and add a new "--unmerged" flag that allows unmerged entries in the index
without any errors.

This also avoids marking the index "changed" if an entry isn't actually 
modified, and makes sure that we exit with an understandable error message 
if the index is corrupt or unreadable. "read_cache()" no longer returns an 
error for the caller to check.

Finally, make die() and usage() exit with recognizable error codes, if we
ever want to check the failure reason in scripts.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
diff --git a/git-status.sh b/git-status.sh
--- a/git-status.sh
+++ b/git-status.sh
@@ -37,7 +37,7 @@ refs/heads/master) ;;
 *)	echo "# On branch $branch" ;;
 esac
 
-git-update-index --refresh >/dev/null 2>&1
+git-update-index -q --unmerged --refresh || exit
 
 if test -f "$GIT_DIR/HEAD"
 then
diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -464,11 +464,15 @@ int read_cache(void)
 
 	errno = EBUSY;
 	if (active_cache)
-		return error("more than one cachefile");
+		return active_nr;
+
 	errno = ENOENT;
 	fd = open(get_index_file(), O_RDONLY);
-	if (fd < 0)
-		return (errno == ENOENT) ? 0 : error("open failed");
+	if (fd < 0) {
+		if (errno == ENOENT)
+			return 0;
+		die("index file open failed (%s)", strerror(errno));
+	}
 
 	size = 0; // avoid gcc warning
 	map = MAP_FAILED;
@@ -480,7 +484,7 @@ int read_cache(void)
 	}
 	close(fd);
 	if (map == MAP_FAILED)
-		return error("mmap failed");
+		die("index file mmap failed (%s)", strerror(errno));
 
 	hdr = map;
 	if (verify_hdr(hdr, size) < 0)
@@ -501,7 +505,7 @@ int read_cache(void)
 unmap:
 	munmap(map, size);
 	errno = EINVAL;
-	return error("verify header failed");
+	die("index file corrupt");
 }
 
 #define WRITE_BUFFER_SIZE 8192
diff --git a/update-index.c b/update-index.c
--- a/update-index.c
+++ b/update-index.c
@@ -13,7 +13,7 @@
  * like "git-update-index *" and suddenly having all the object
  * files be revision controlled.
  */
-static int allow_add = 0, allow_remove = 0, allow_replace = 0, not_new = 0, quiet = 0, info_only = 0;
+static int allow_add = 0, allow_remove = 0, allow_replace = 0, allow_unmerged = 0, not_new = 0, quiet = 0, info_only = 0;
 static int force_remove;
 
 /* Three functions to allow overloaded pointer return; see linux/err.h */
@@ -135,7 +135,7 @@ static struct cache_entry *refresh_entry
 
 	changed = ce_match_stat(ce, &st);
 	if (!changed)
-		return ce;
+		return NULL;
 
 	if (ce_modified(ce, &st))
 		return ERR_PTR(-EINVAL);
@@ -156,16 +156,20 @@ static int refresh_cache(void)
 		struct cache_entry *ce, *new;
 		ce = active_cache[i];
 		if (ce_stage(ce)) {
-			printf("%s: needs merge\n", ce->name);
-			has_errors = 1;
 			while ((i < active_nr) &&
 			       ! strcmp(active_cache[i]->name, ce->name))
 				i++;
 			i--;
+			if (allow_unmerged)
+				continue;
+			printf("%s: needs merge\n", ce->name);
+			has_errors = 1;
 			continue;
 		}
 
 		new = refresh_entry(ce);
+		if (!new)
+			continue;
 		if (IS_ERR(new)) {
 			if (not_new && PTR_ERR(new) == -ENOENT)
 				continue;
@@ -335,6 +339,10 @@ int main(int argc, const char **argv)
 				allow_remove = 1;
 				continue;
 			}
+			if (!strcmp(path, "--unmerged")) {
+				allow_unmerged = 1;
+				continue;
+			}
 			if (!strcmp(path, "--refresh")) {
 				has_errors |= refresh_cache();
 				continue;
diff --git a/usage.c b/usage.c
--- a/usage.c
+++ b/usage.c
@@ -15,7 +15,7 @@ static void report(const char *prefix, c
 void usage(const char *err)
 {
 	fprintf(stderr, "usage: %s\n", err);
-	exit(1);
+	exit(129);
 }
 
 void die(const char *err, ...)
@@ -25,7 +25,7 @@ void die(const char *err, ...)
 	va_start(params, err);
 	report("fatal: ", err, params);
 	va_end(params);
-	exit(1);
+	exit(128);
 }
 
 int error(const char *err, ...)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-01 19:07     ` Honor extractor's umask in git-tar-tree Junio C Hamano
@ 2005-10-02  3:24       ` H. Peter Anvin
  2005-10-02  9:55         ` Matthias Urlichs
  0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-02  3:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Rene Scharfe, Linus Torvalds

Junio C Hamano wrote:
> The archive generated with git-tar-tree had 0755 and 0644 mode bits.
> This inconvenienced the extractor with umask 002 by robbing g+w bit
> unconditionally.  Just write it out with loose permissions bits and
> let the umask of the extractor do its job.

I've thought that it would be nice if the files/directories were written 
into the archive with 0666/0777 permissions by default, and then 
extracted with the umask honoured.  A special option then could be used 
to add files with special permissions, like files in .ssh, which *have* 
to be g-w or sshd will reject them.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-02  3:24       ` H. Peter Anvin
@ 2005-10-02  9:55         ` Matthias Urlichs
  2005-10-03  4:44           ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Matthias Urlichs @ 2005-10-02  9:55 UTC (permalink / raw)
  To: git

Hi, H. Peter Anvin wrote:

> I've thought that it would be nice if the files/directories were written
> into the archive with 0666/0777 permissions by default, and then
> extracted with the umask honoured.

The git archive oesn't *have* permissions, just one "execute" bit.

>  A special option then could be used
> to add files with special permissions, like files in .ssh, which *have*
> to be g-w or sshd will reject them.
> 
I'd include a script in the archive which you'd run afterwards to fix
problems like this. IMHO, in most situations you'll need it anyway
(for instance, to re-start services).

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
As I approached the intersection a sign suddenly appeared in a place
where no stop sign had ever appeared before. I was unable to stop in
time to avoid the accident. To avoid hitting the bumper of the car in
front, I struck the pedestrian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-02  9:55         ` Matthias Urlichs
@ 2005-10-03  4:44           ` H. Peter Anvin
  2005-10-03  5:10             ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03  4:44 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: git

Matthias Urlichs wrote:
> Hi, H. Peter Anvin wrote:
> 
>>I've thought that it would be nice if the files/directories were written
>>into the archive with 0666/0777 permissions by default, and then
>>extracted with the umask honoured.
> 
> The git archive oesn't *have* permissions, just one "execute" bit.
> 

My point is that I believe it should.  It has the bitfield for it, it 
just doesn't use it at the moment.

>> A special option then could be used
>>to add files with special permissions, like files in .ssh, which *have*
>>to be g-w or sshd will reject them.
> 
> I'd include a script in the archive which you'd run afterwards to fix
> problems like this. IMHO, in most situations you'll need it anyway
> (for instance, to re-start services).

That is true in some cases, but I highly disagree with the statement "most".

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03  4:44           ` H. Peter Anvin
@ 2005-10-03  5:10             ` Junio C Hamano
  2005-10-03 16:30               ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-03  5:10 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> writes:

> My point is that I believe it should.  It has the bitfield for it, it 
> just doesn't use it at the moment.

It is a bit more complicated than that.

Long time ago, we used to store the full permission bits and
ended up storing files in 0644 and 0664 modes, depending on who
is writing the tree object.  People with umask 022 checked out
from a tree that recorded blobs with 0664 bits and ended up
getting "mode changed" diff all the time, which was unacceptable
from the SCM point of view.  We _could_ have really changed the
mode bits representation in the tree objects back then to have
type + executable bit, but to preserve backward compatibility,
we chose to keep the bitfield layout and changed the code to
treat 1006xx and 1007yy in older trees to be equivalent to
100644 and 100755.  These days, for newly written tree objects,
above xx and yy 6-bit fields are "Must Be 4" and "Must Be 5"
fields, respectively, not bitfields to store arbitrary group and
other permission information.  git-fsck-objects even complains
about them.

So in that sense, it does _not_ have the bitfield for it, and
obviously we cannot use what we do not have.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03  5:10             ` Junio C Hamano
@ 2005-10-03 16:30               ` H. Peter Anvin
  2005-10-03 17:18                 ` Junio C Hamano
  2005-10-03 17:45                 ` Linus Torvalds
  0 siblings, 2 replies; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 16:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
> 
>>My point is that I believe it should.  It has the bitfield for it, it 
>>just doesn't use it at the moment.
> 
> 
> It is a bit more complicated than that.
> 
> Long time ago, we used to store the full permission bits and
> ended up storing files in 0644 and 0664 modes, depending on who
> is writing the tree object.  People with umask 022 checked out
> from a tree that recorded blobs with 0664 bits and ended up
> getting "mode changed" diff all the time, which was unacceptable
> from the SCM point of view.  We _could_ have really changed the
> mode bits representation in the tree objects back then to have
> type + executable bit, but to preserve backward compatibility,
> we chose to keep the bitfield layout and changed the code to
> treat 1006xx and 1007yy in older trees to be equivalent to
> 100644 and 100755.  These days, for newly written tree objects,
> above xx and yy 6-bit fields are "Must Be 4" and "Must Be 5"
> fields, respectively, not bitfields to store arbitrary group and
> other permission information.  git-fsck-objects even complains
> about them.
> 
> So in that sense, it does _not_ have the bitfield for it, and
> obviously we cannot use what we do not have.
> 

Welcome to the wonderful world of evolving file formats.

As you stated above, we currently use this field in a very inefficient 
manner, because of old mistakes.  There are several ways to recover from 
here, some of which are more complex than others.

In the case of git, there isn't just the requirement to maintain old 
formats indefinitely (due to the cryptographic chain), but also that new 
objects that are compatible with old format should be written in the old 
format to maintain the aliasing properties.  These are obstacles that 
are perfectly possible to overcome, although it takes a bit of legwork.

If the old-format (with random write bits) is out of circulation -- 
which I can't tell for sure they it is, but Linus' kernel tree doesn't 
seem to have any of these objects -- then the answer is very simple: 
redefine this field _a posteori_ to be the mode ^ 022 (or perhaps more 
sanely,
mode ^ (mode & 0200 ? 022 : 0)).  Compatibility and contents is fully 
preserved.  No problem.

If there are still old-format trees in circulation and compatibility 
with these very old trees need to be maintained, then it's a bit more 
complicated, but literally just a bit.  This data is already stored in 
text form in the object store, so there aren't any funnies with 
expanding it.  For example, encode a leading + on the octal value if 
this is a value with "I really mean it" permissions (and *only* those 
values.)  This is even readable by older versions of git, since they 
will just blindly ignore the + sign.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03 16:30               ` H. Peter Anvin
@ 2005-10-03 17:18                 ` Junio C Hamano
  2005-10-03 17:28                   ` H. Peter Anvin
  2005-10-03 17:45                 ` Linus Torvalds
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-03 17:18 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: git

"H. Peter Anvin" <hpa@zytor.com> writes:

> As you stated above, we currently use this field in a very inefficient 
> manner, because of old mistakes.  There are several ways to recover from 
> here, some of which are more complex than others.

Solution for in-tree permission mode bits you outlined looked
fine (I'll have to re-read the part about "mode xor (mode &
022)..." part later, though).

For in-cache permission mode bits, we would probably need
something like this:

  * git-update-index will pick up the filesystem bits with the
    current semantics (i.e. look only at (mode & 0100) and
    force 0644 or 0755) by default; --full-perm-bits option
    would bypass this bits munging.

    Once a file is added with --full-perm-bits, it might be
    nice if index file remembers to pick up the full bits next
    time git-update-index is run on the path.  This could be
    achieved by saying that anything stored in the cache with
    non 100644, 100755 nor 120000 bits are such paths without
    having to change the index file format.

  * there are bunch of codes that assume 0644 and 0755 are the
    norm but also know that there are ancient trees that have
    0664 and 0775 and try to treat them equivalently.  They need
    to be selectively neutered; this applies to in-tree
    permission bits as well.

    git-read-tree will read permission mode bits from tree
    object as-is.  I.e. you will get 0644 and 0755 in cache from
    the existing tree objects.  When you check things out with
    002 umask, you will get 0664 and 0775 on the filesystem.  We
    do not want to consider this "mode changed by the user".

    git-update-index --refresh code should not be mode neutered
    to prevent this.  The same thing goes for diff.  These
    currently canonicalize mode bits by looking at (mode &
    0100), but should be changed to do so only when index has
    already the canonical mode bits, or something like that.

  * git-write-tree and git-fsck-objects probably has code to
    reject and correct abnormal mode bits.  They need to be
    neutered.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03 17:18                 ` Junio C Hamano
@ 2005-10-03 17:28                   ` H. Peter Anvin
  0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 17:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> 
> For in-cache permission mode bits, we would probably need
> something like this:
> 
>   * git-update-index will pick up the filesystem bits with the
>     current semantics (i.e. look only at (mode & 0100) and
>     force 0644 or 0755) by default; --full-perm-bits option
>     would bypass this bits munging.
> 
>     Once a file is added with --full-perm-bits, it might be
>     nice if index file remembers to pick up the full bits next
>     time git-update-index is run on the path.  This could be
>     achieved by saying that anything stored in the cache with
>     non 100644, 100755 nor 120000 bits are such paths without
>     having to change the index file format.
> 

One could also say that since oddball permissions are an exception, not 
the rule, that one should use a "git-chmod" command to enter the 
permissions in the cache.  The correct answer is probably *both* that 
and --full-permissions (or whatever) since they both probably apply to 
different workflows.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03 16:30               ` H. Peter Anvin
  2005-10-03 17:18                 ` Junio C Hamano
@ 2005-10-03 17:45                 ` Linus Torvalds
  2005-10-03 18:05                   ` H. Peter Anvin
  1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2005-10-03 17:45 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, git



On Mon, 3 Oct 2005, H. Peter Anvin wrote:
> 
> If the old-format (with random write bits) is out of circulation -- which I
> can't tell for sure they it is, but Linus' kernel tree doesn't seem to have
> any of these objects

Oh, it does.

Run "git-fsck-cache --full --strict", and you'll get several trees that 
the strict checker marks as bad. I think it's mostly all one entry, namely 
arch/i386/kernel/vsyscall-note.S being marked 0664.

Git itself has even more of them - the kernel actually has fewer, because 
most work was done with a consistent umask of 022 (ie mostly mine), and by 
the time others started using git actively, we'd already changed the git 
rules.

However, there's nothing that says that we couldn't use one more bit in 
the "mode" flag to just say "this is an _exact_ mode, please preserve it". 
A kind of "sticky mode" for git. We've got _bits_ plenty: it's an ASCII 
text-mode representation in the trees (infinite bits), and even in the 
index it's a 32-bit thing that we only use 12 bits of (9 bits for 
permissions, 3 bits for the sparsely represented directory/symlink/regular 
file)

We'd have to be a bit careful to preserve that bit when doing an index 
refresh, but it's really not very hard. The hardest part is actually doing 
so for directories, since we don't keep the directories in the index at 
_all_.

But the fact is, it wouldn't solve the git-tar-tree thing. We can 
_represent_ exact masks, but we don't _want_ to, because normally it just 
leads to horrible problems with different people having different umasks. 
So in order to avoid having mode change merges, we'd _still_ have to make 
the current "0666/0777 + umask" be the normal one, and you'd use this 
"exact mode" thing only for very special cases (ie for backing up your 
home directory or similar, _not: for a distributed SCM).

As to tar: I think the current

        if (S_ISDIR(mode) || S_ISREG(mode))
                mode |= (mode & 0100) ? 0777 : 0666;

is wrong. It makes things world-writable by default, and that's just 
dangerous. "tar" normally won't apply umask when untarring (there's a flag 
for it, but I have never ever used it myself, and I doubt anybody else 
really does either - it's called "--no-same-permissions" in GNU tar).

I think a "0775" or "0664" might be acceptable (an umask of 002 is at 
least _normal_), but I suspect 0755/0644 is really better. Doing a simple 

	chmod -R +w

afterwards is better (and takes umask into account) than a "chmod -R o-w", 
since the latter leaves the tree writable for a while.

Ie default permissions are better off being too strict than too lax. Basic 
security.

Of course, if we were to add the "exact mode" bit, then git-tar-tree 
should obviously honor that for any files that have that bit set.

		Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03 17:45                 ` Linus Torvalds
@ 2005-10-03 18:05                   ` H. Peter Anvin
  2005-10-03 18:18                     ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 18:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
> 
> As to tar: I think the current
> 
>         if (S_ISDIR(mode) || S_ISREG(mode))
>                 mode |= (mode & 0100) ? 0777 : 0666;
> 
> is wrong. It makes things world-writable by default, and that's just 
> dangerous.

That's standard in the Unix world, though; of course, the user's umask 
shouldn't be set to zero unless things are in very special 
circumstances.  In the case of tar, the umask is applied on extraction 
unless the user explicitly specifies -p.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03 18:05                   ` H. Peter Anvin
@ 2005-10-03 18:18                     ` Linus Torvalds
  2005-10-03 18:32                       ` H. Peter Anvin
  0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2005-10-03 18:18 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Junio C Hamano, git



On Mon, 3 Oct 2005, H. Peter Anvin wrote:
>
> Linus Torvalds wrote:
> > 
> > As to tar: I think the current
> > 
> >         if (S_ISDIR(mode) || S_ISREG(mode))
> >                 mode |= (mode & 0100) ? 0777 : 0666;
> > 
> > is wrong. It makes things world-writable by default, and that's just
> > dangerous.
> 
> That's standard in the Unix world, though; of course, the user's umask
> shouldn't be set to zero unless things are in very special circumstances.  In
> the case of tar, the umask is applied on extraction unless the user explicitly
> specifies -p.

Is it? The only place umask is mentioned in the man-page is

       --no-same-permissions
              apply  user?s  umask  when  extracting files instead of recorded
              permissions

but if tar really does honor umask, then hey, that 0777/0666 is fine.

(Ahh, googling a bit more, it appears that "-p" is default for root, which 
explains why you'd need the "anti-flag").

		Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Honor extractor's umask in git-tar-tree.
  2005-10-03 18:18                     ` Linus Torvalds
@ 2005-10-03 18:32                       ` H. Peter Anvin
  0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 18:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Linus Torvalds wrote:
> 
> Is it? The only place umask is mentioned in the man-page is
> 
>        --no-same-permissions
>               apply  user?s  umask  when  extracting files instead of recorded
>               permissions
> 
> but if tar really does honor umask, then hey, that 0777/0666 is fine.
> 
> (Ahh, googling a bit more, it appears that "-p" is default for root, which 
> explains why you'd need the "anti-flag").
> 

Yeah, that assymetry is rather unfortunate.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2005-10-03 18:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-30 16:03 Destructive side-effect of "cg-status" Wolfgang Denk
2005-10-01 10:24 ` Martin Langhoff
2005-10-01 16:41 ` Linus Torvalds
2005-10-01 18:14   ` Junio C Hamano
2005-10-01 19:07     ` Honor extractor's umask in git-tar-tree Junio C Hamano
2005-10-02  3:24       ` H. Peter Anvin
2005-10-02  9:55         ` Matthias Urlichs
2005-10-03  4:44           ` H. Peter Anvin
2005-10-03  5:10             ` Junio C Hamano
2005-10-03 16:30               ` H. Peter Anvin
2005-10-03 17:18                 ` Junio C Hamano
2005-10-03 17:28                   ` H. Peter Anvin
2005-10-03 17:45                 ` Linus Torvalds
2005-10-03 18:05                   ` H. Peter Anvin
2005-10-03 18:18                     ` Linus Torvalds
2005-10-03 18:32                       ` H. Peter Anvin
2005-10-01 19:42   ` Destructive side-effect of "cg-status" Wolfgang Denk
2005-10-01 20:24     ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).