* Destructive side-effect of "cg-status"
@ 2005-09-30 16:03 Wolfgang Denk
2005-10-01 10:24 ` Martin Langhoff
2005-10-01 16:41 ` Linus Torvalds
0 siblings, 2 replies; 18+ messages in thread
From: Wolfgang Denk @ 2005-09-30 16:03 UTC (permalink / raw)
To: git
So far I thought "cg-status" is a harmless command which just
displays some status information. It ain't so. One of our engineers
reported a corrupted repository after I ran "cg-status" in his
directory:
$ cg-status
Heads:
>master 805f93e4ca96d0c0cb2d2f9532d9666b22961e88
R origin 805f93e4ca96d0c0cb2d2f9532d9666b22961e88
error: open failed
fatal: cache corrupted
error: open failed
? COPYING
? CREDITS
? Documentation/00-INDEX
? Documentation/BUG-HUNTING
...
error: open failed
read_cache: Permission denied
...
error: open failed
read_cache: Permission denied
...
As mentioned before, all I did was running "cg-status" in his
directory. Here is what happens:
Before:
-> rpm -q cogito
cogito-0.15.1-1
-> id
uid=500(wd) gid=500(wd) groups=200(gitmaster),400(denx),500(wd)
-> umask
0002
-> ls -ld .git
drwxrwxrwx 6 sr sr 80 Sep 30 17:49 .git
-> ls -l .git/index
-rw-r--r-- 1 sr sr 1728032 Sep 30 17:17 .git/index
Then:
-> cg-status
Heads:
>master 805f93e4ca96d0c0cb2d2f9532d9666b22961e88
R origin 805f93e4ca96d0c0cb2d2f9532d9666b22961e88
M arch/ppc/configs/bubinga_defconfig
M arch/ppc/configs/walnut_defconfig
-> ls -l .git/index
-rw------- 1 wd wd 1728032 Sep 30 17:49 .git/index
^^^^^^^^^^ ^^^^^
That means, that "cg-status" actually *rewrote* .git/index, with me
(wd) as new owner, and - ignoring my umask - with permissions that
prevent the original owner (sr) to access the file!
Arghhhh!!!
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
Generally speaking, there are other ways to accomplish whatever it is
that you think you need ... - Doug Gwyn
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Destructive side-effect of "cg-status"
2005-09-30 16:03 Destructive side-effect of "cg-status" Wolfgang Denk
@ 2005-10-01 10:24 ` Martin Langhoff
2005-10-01 16:41 ` Linus Torvalds
1 sibling, 0 replies; 18+ messages in thread
From: Martin Langhoff @ 2005-10-01 10:24 UTC (permalink / raw)
To: Wolfgang Denk; +Cc: git
On 10/1/05, Wolfgang Denk <wd@denx.de> wrote:
> So far I thought "cg-status" is a harmless command which just
> displays some status information. It ain't so. One of our engineers
> reported a corrupted repository after I ran "cg-status" in his
> directory:
Interesting... cg-status, and sometimes cg-diff, have to update the
index. The index is the trick behind git's performance (and some other
smarts). It's never really touched by Cogito, but by the git-*-index
commands.
Perhaps git-*-index commands should check ownership vs current uid and
print a warning?
cheers,
martin
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Destructive side-effect of "cg-status"
2005-09-30 16:03 Destructive side-effect of "cg-status" Wolfgang Denk
2005-10-01 10:24 ` Martin Langhoff
@ 2005-10-01 16:41 ` Linus Torvalds
2005-10-01 18:14 ` Junio C Hamano
2005-10-01 19:42 ` Destructive side-effect of "cg-status" Wolfgang Denk
1 sibling, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2005-10-01 16:41 UTC (permalink / raw)
To: Wolfgang Denk; +Cc: Git Mailing List, Junio C Hamano
On Fri, 30 Sep 2005, Wolfgang Denk wrote:
>
> So far I thought "cg-status" is a harmless command which just
> displays some status information. It ain't so. One of our engineers
> reported a corrupted repository after I ran "cg-status" in his
> directory:
Well, it's not corrupted, but yes, the index file ends up unreadable.
> That means, that "cg-status" actually *rewrote* .git/index, with me
> (wd) as new owner, and - ignoring my umask - with permissions that
> prevent the original owner (sr) to access the file!
The umask thing looks like a bug. Fixed thus.
Also, arguably we should try to avoid writing the index file when not
necessary, although the fact is, that cg-status (and "git status") _do_
need to actually keep it up-to-date in order to do the right thing. Also
true of some other programs that might otherwise appear to be read-only
(ie I've considered doing the same thing for "git diff").
We used to have that optimization, but it was broken. I fixed it but
disabled it for fear of other bugs.
But honoring umask would seem to be a no-brainer.
Linus
----
diff --git a/index.c b/index.c
--- a/index.c
+++ b/index.c
@@ -29,7 +29,7 @@ int hold_index_file_for_update(struct ca
signal(SIGINT, remove_lock_file_on_signal);
atexit(remove_lock_file);
}
- return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0600);
+ return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0666);
}
int commit_index_file(struct cache_file *cf)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Destructive side-effect of "cg-status"
2005-10-01 16:41 ` Linus Torvalds
@ 2005-10-01 18:14 ` Junio C Hamano
2005-10-01 19:07 ` Honor extractor's umask in git-tar-tree Junio C Hamano
2005-10-01 19:42 ` Destructive side-effect of "cg-status" Wolfgang Denk
1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-01 18:14 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Wolfgang Denk, Git Mailing List
Linus Torvalds <torvalds@osdl.org> writes:
> - return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0600);
> + return open(cf->lockfile, O_RDWR | O_CREAT | O_EXCL, 0666);
Good spotting - thanks. We tried to use 0666/0777 everywhere
and let umask do its job, but this was one of the two places we
still had 0[67]00. I'd do the same for the other 0600 in
mailsplit.c, not that I think it matters, but just for
consistency.
Unrelated to the topic at hand, but related to the mode bits --
tar-tree generates archives with 0644/0755 permission bits. It
might not be a bad idea to just let the tar command honor umask
of the extracter, by storing 0666 and 0777 in the archive.
I always work in an environment where umask 002 is the norm, and
get irritated when upstream tarballs of other peoples' projects
create directories with mode 0755, making me do chmod 2775 on
them.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Honor extractor's umask in git-tar-tree.
2005-10-01 18:14 ` Junio C Hamano
@ 2005-10-01 19:07 ` Junio C Hamano
2005-10-02 3:24 ` H. Peter Anvin
0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-01 19:07 UTC (permalink / raw)
To: git; +Cc: Rene Scharfe, Linus Torvalds
The archive generated with git-tar-tree had 0755 and 0644 mode bits.
This inconvenienced the extractor with umask 002 by robbing g+w bit
unconditionally. Just write it out with loose permissions bits and
let the umask of the extractor do its job.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Junio C Hamano <junkio@cox.net> writes:
> Unrelated to the topic at hand, but related to the mode bits --
> tar-tree generates archives with 0644/0755 permission bits. It
> might not be a bad idea to just let the tar command honor umask
> of the extracter, by storing 0666 and 0777 in the archive.
>
> I always work in an environment where umask 002 is the norm, and
> get irritated when upstream tarballs of other peoples' projects
> create directories with mode 0755, making me do chmod 2775 on
> them.
diff --git a/tar-tree.c b/tar-tree.c
--- a/tar-tree.c
+++ b/tar-tree.c
@@ -353,6 +353,7 @@ static void traverse_tree(void *buffer,
if (size < namelen + 20 || sscanf(buffer, "%o", &mode) != 1)
die("corrupt 'tree' file");
+ mode |= (mode & 0100) ? 0777 : 0666;
buffer = sha1 + 20;
size -= namelen + 20;
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Destructive side-effect of "cg-status"
2005-10-01 16:41 ` Linus Torvalds
2005-10-01 18:14 ` Junio C Hamano
@ 2005-10-01 19:42 ` Wolfgang Denk
2005-10-01 20:24 ` Linus Torvalds
1 sibling, 1 reply; 18+ messages in thread
From: Wolfgang Denk @ 2005-10-01 19:42 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List, Junio C Hamano
In message <Pine.LNX.4.64.0510010934290.3378@g5.osdl.org>
Linus Torvalds wrote:
>
> Also, arguably we should try to avoid writing the index file when not
> necessary, although the fact is, that cg-status (and "git status") _do_
> need to actually keep it up-to-date in order to do the right thing. Also
> true of some other programs that might otherwise appear to be read-only
> (ie I've considered doing the same thing for "git diff").
But shouldn't it be possible to run such commands as "status" and
"diff" in a repository for which I have only read permissions? Or how
can I find out about the status of another user's repository without
actually modifying it?
Also, error reporting is IMHO not sufficient and misleading. For
example:
$ git status 2>&1 | less
error: open failed
#
# Updated but not checked in:
# (will commit)
#
# deleted: CHANGELOG
# deleted: COPYING
# deleted: CREDITS
# deleted: MAINTAINERS
# deleted: MAKEALL
# deleted: Makefile
# deleted: README
...
[all files in the repository flagged as "deleted" !]
#
error: open failed
read_cache: Permission denied
The "error: open failed" should at leats include the file name and
the errno/strerror message. Same for the "read_cache: Permission
denied" - of course, if you knot the git internals you will know what
this means, but the average user has no idea that he should check the
permissions of .git/index.
Finally, a thick fat warning should be added to the documentation
that these commands actually (may) modify the repository. This was
totally unexpected for me.
Thanks.
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
In an infinite universe all things are possible, including the possi-
bility that the universe does not exist.
- Terry Pratchett, _The Dark Side of the Sun_
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Destructive side-effect of "cg-status"
2005-10-01 19:42 ` Destructive side-effect of "cg-status" Wolfgang Denk
@ 2005-10-01 20:24 ` Linus Torvalds
0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2005-10-01 20:24 UTC (permalink / raw)
To: Wolfgang Denk; +Cc: Git Mailing List, Junio C Hamano
On Sat, 1 Oct 2005, Wolfgang Denk wrote:
>
> The "error: open failed" should at leats include the file name and
> the errno/strerror message. Same for the "read_cache: Permission
> denied" - of course, if you knot the git internals you will know what
> this means, but the average user has no idea that he should check the
> permissions of .git/index.
Agreed.
Something like this would seem sane.
Linus
---
Subject: Better error reporting for "git status"
Instead of "git status" ignoring (and hiding) potential errors from the
"git-update-index" call, make it exit if it fails, and show the error.
In order to do this, use the "-q" flag (to ignore not-up-to-date files)
and add a new "--unmerged" flag that allows unmerged entries in the index
without any errors.
This also avoids marking the index "changed" if an entry isn't actually
modified, and makes sure that we exit with an understandable error message
if the index is corrupt or unreadable. "read_cache()" no longer returns an
error for the caller to check.
Finally, make die() and usage() exit with recognizable error codes, if we
ever want to check the failure reason in scripts.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
diff --git a/git-status.sh b/git-status.sh
--- a/git-status.sh
+++ b/git-status.sh
@@ -37,7 +37,7 @@ refs/heads/master) ;;
*) echo "# On branch $branch" ;;
esac
-git-update-index --refresh >/dev/null 2>&1
+git-update-index -q --unmerged --refresh || exit
if test -f "$GIT_DIR/HEAD"
then
diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -464,11 +464,15 @@ int read_cache(void)
errno = EBUSY;
if (active_cache)
- return error("more than one cachefile");
+ return active_nr;
+
errno = ENOENT;
fd = open(get_index_file(), O_RDONLY);
- if (fd < 0)
- return (errno == ENOENT) ? 0 : error("open failed");
+ if (fd < 0) {
+ if (errno == ENOENT)
+ return 0;
+ die("index file open failed (%s)", strerror(errno));
+ }
size = 0; // avoid gcc warning
map = MAP_FAILED;
@@ -480,7 +484,7 @@ int read_cache(void)
}
close(fd);
if (map == MAP_FAILED)
- return error("mmap failed");
+ die("index file mmap failed (%s)", strerror(errno));
hdr = map;
if (verify_hdr(hdr, size) < 0)
@@ -501,7 +505,7 @@ int read_cache(void)
unmap:
munmap(map, size);
errno = EINVAL;
- return error("verify header failed");
+ die("index file corrupt");
}
#define WRITE_BUFFER_SIZE 8192
diff --git a/update-index.c b/update-index.c
--- a/update-index.c
+++ b/update-index.c
@@ -13,7 +13,7 @@
* like "git-update-index *" and suddenly having all the object
* files be revision controlled.
*/
-static int allow_add = 0, allow_remove = 0, allow_replace = 0, not_new = 0, quiet = 0, info_only = 0;
+static int allow_add = 0, allow_remove = 0, allow_replace = 0, allow_unmerged = 0, not_new = 0, quiet = 0, info_only = 0;
static int force_remove;
/* Three functions to allow overloaded pointer return; see linux/err.h */
@@ -135,7 +135,7 @@ static struct cache_entry *refresh_entry
changed = ce_match_stat(ce, &st);
if (!changed)
- return ce;
+ return NULL;
if (ce_modified(ce, &st))
return ERR_PTR(-EINVAL);
@@ -156,16 +156,20 @@ static int refresh_cache(void)
struct cache_entry *ce, *new;
ce = active_cache[i];
if (ce_stage(ce)) {
- printf("%s: needs merge\n", ce->name);
- has_errors = 1;
while ((i < active_nr) &&
! strcmp(active_cache[i]->name, ce->name))
i++;
i--;
+ if (allow_unmerged)
+ continue;
+ printf("%s: needs merge\n", ce->name);
+ has_errors = 1;
continue;
}
new = refresh_entry(ce);
+ if (!new)
+ continue;
if (IS_ERR(new)) {
if (not_new && PTR_ERR(new) == -ENOENT)
continue;
@@ -335,6 +339,10 @@ int main(int argc, const char **argv)
allow_remove = 1;
continue;
}
+ if (!strcmp(path, "--unmerged")) {
+ allow_unmerged = 1;
+ continue;
+ }
if (!strcmp(path, "--refresh")) {
has_errors |= refresh_cache();
continue;
diff --git a/usage.c b/usage.c
--- a/usage.c
+++ b/usage.c
@@ -15,7 +15,7 @@ static void report(const char *prefix, c
void usage(const char *err)
{
fprintf(stderr, "usage: %s\n", err);
- exit(1);
+ exit(129);
}
void die(const char *err, ...)
@@ -25,7 +25,7 @@ void die(const char *err, ...)
va_start(params, err);
report("fatal: ", err, params);
va_end(params);
- exit(1);
+ exit(128);
}
int error(const char *err, ...)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-01 19:07 ` Honor extractor's umask in git-tar-tree Junio C Hamano
@ 2005-10-02 3:24 ` H. Peter Anvin
2005-10-02 9:55 ` Matthias Urlichs
0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-02 3:24 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Rene Scharfe, Linus Torvalds
Junio C Hamano wrote:
> The archive generated with git-tar-tree had 0755 and 0644 mode bits.
> This inconvenienced the extractor with umask 002 by robbing g+w bit
> unconditionally. Just write it out with loose permissions bits and
> let the umask of the extractor do its job.
I've thought that it would be nice if the files/directories were written
into the archive with 0666/0777 permissions by default, and then
extracted with the umask honoured. A special option then could be used
to add files with special permissions, like files in .ssh, which *have*
to be g-w or sshd will reject them.
-hpa
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-02 3:24 ` H. Peter Anvin
@ 2005-10-02 9:55 ` Matthias Urlichs
2005-10-03 4:44 ` H. Peter Anvin
0 siblings, 1 reply; 18+ messages in thread
From: Matthias Urlichs @ 2005-10-02 9:55 UTC (permalink / raw)
To: git
Hi, H. Peter Anvin wrote:
> I've thought that it would be nice if the files/directories were written
> into the archive with 0666/0777 permissions by default, and then
> extracted with the umask honoured.
The git archive oesn't *have* permissions, just one "execute" bit.
> A special option then could be used
> to add files with special permissions, like files in .ssh, which *have*
> to be g-w or sshd will reject them.
>
I'd include a script in the archive which you'd run afterwards to fix
problems like this. IMHO, in most situations you'll need it anyway
(for instance, to re-start services).
--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
As I approached the intersection a sign suddenly appeared in a place
where no stop sign had ever appeared before. I was unable to stop in
time to avoid the accident. To avoid hitting the bumper of the car in
front, I struck the pedestrian.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-02 9:55 ` Matthias Urlichs
@ 2005-10-03 4:44 ` H. Peter Anvin
2005-10-03 5:10 ` Junio C Hamano
0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 4:44 UTC (permalink / raw)
To: Matthias Urlichs; +Cc: git
Matthias Urlichs wrote:
> Hi, H. Peter Anvin wrote:
>
>>I've thought that it would be nice if the files/directories were written
>>into the archive with 0666/0777 permissions by default, and then
>>extracted with the umask honoured.
>
> The git archive oesn't *have* permissions, just one "execute" bit.
>
My point is that I believe it should. It has the bitfield for it, it
just doesn't use it at the moment.
>> A special option then could be used
>>to add files with special permissions, like files in .ssh, which *have*
>>to be g-w or sshd will reject them.
>
> I'd include a script in the archive which you'd run afterwards to fix
> problems like this. IMHO, in most situations you'll need it anyway
> (for instance, to re-start services).
That is true in some cases, but I highly disagree with the statement "most".
-hpa
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 4:44 ` H. Peter Anvin
@ 2005-10-03 5:10 ` Junio C Hamano
2005-10-03 16:30 ` H. Peter Anvin
0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-03 5:10 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
"H. Peter Anvin" <hpa@zytor.com> writes:
> My point is that I believe it should. It has the bitfield for it, it
> just doesn't use it at the moment.
It is a bit more complicated than that.
Long time ago, we used to store the full permission bits and
ended up storing files in 0644 and 0664 modes, depending on who
is writing the tree object. People with umask 022 checked out
from a tree that recorded blobs with 0664 bits and ended up
getting "mode changed" diff all the time, which was unacceptable
from the SCM point of view. We _could_ have really changed the
mode bits representation in the tree objects back then to have
type + executable bit, but to preserve backward compatibility,
we chose to keep the bitfield layout and changed the code to
treat 1006xx and 1007yy in older trees to be equivalent to
100644 and 100755. These days, for newly written tree objects,
above xx and yy 6-bit fields are "Must Be 4" and "Must Be 5"
fields, respectively, not bitfields to store arbitrary group and
other permission information. git-fsck-objects even complains
about them.
So in that sense, it does _not_ have the bitfield for it, and
obviously we cannot use what we do not have.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 5:10 ` Junio C Hamano
@ 2005-10-03 16:30 ` H. Peter Anvin
2005-10-03 17:18 ` Junio C Hamano
2005-10-03 17:45 ` Linus Torvalds
0 siblings, 2 replies; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 16:30 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
>
>
>>My point is that I believe it should. It has the bitfield for it, it
>>just doesn't use it at the moment.
>
>
> It is a bit more complicated than that.
>
> Long time ago, we used to store the full permission bits and
> ended up storing files in 0644 and 0664 modes, depending on who
> is writing the tree object. People with umask 022 checked out
> from a tree that recorded blobs with 0664 bits and ended up
> getting "mode changed" diff all the time, which was unacceptable
> from the SCM point of view. We _could_ have really changed the
> mode bits representation in the tree objects back then to have
> type + executable bit, but to preserve backward compatibility,
> we chose to keep the bitfield layout and changed the code to
> treat 1006xx and 1007yy in older trees to be equivalent to
> 100644 and 100755. These days, for newly written tree objects,
> above xx and yy 6-bit fields are "Must Be 4" and "Must Be 5"
> fields, respectively, not bitfields to store arbitrary group and
> other permission information. git-fsck-objects even complains
> about them.
>
> So in that sense, it does _not_ have the bitfield for it, and
> obviously we cannot use what we do not have.
>
Welcome to the wonderful world of evolving file formats.
As you stated above, we currently use this field in a very inefficient
manner, because of old mistakes. There are several ways to recover from
here, some of which are more complex than others.
In the case of git, there isn't just the requirement to maintain old
formats indefinitely (due to the cryptographic chain), but also that new
objects that are compatible with old format should be written in the old
format to maintain the aliasing properties. These are obstacles that
are perfectly possible to overcome, although it takes a bit of legwork.
If the old-format (with random write bits) is out of circulation --
which I can't tell for sure they it is, but Linus' kernel tree doesn't
seem to have any of these objects -- then the answer is very simple:
redefine this field _a posteori_ to be the mode ^ 022 (or perhaps more
sanely,
mode ^ (mode & 0200 ? 022 : 0)). Compatibility and contents is fully
preserved. No problem.
If there are still old-format trees in circulation and compatibility
with these very old trees need to be maintained, then it's a bit more
complicated, but literally just a bit. This data is already stored in
text form in the object store, so there aren't any funnies with
expanding it. For example, encode a leading + on the octal value if
this is a value with "I really mean it" permissions (and *only* those
values.) This is even readable by older versions of git, since they
will just blindly ignore the + sign.
-hpa
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 16:30 ` H. Peter Anvin
@ 2005-10-03 17:18 ` Junio C Hamano
2005-10-03 17:28 ` H. Peter Anvin
2005-10-03 17:45 ` Linus Torvalds
1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2005-10-03 17:18 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
"H. Peter Anvin" <hpa@zytor.com> writes:
> As you stated above, we currently use this field in a very inefficient
> manner, because of old mistakes. There are several ways to recover from
> here, some of which are more complex than others.
Solution for in-tree permission mode bits you outlined looked
fine (I'll have to re-read the part about "mode xor (mode &
022)..." part later, though).
For in-cache permission mode bits, we would probably need
something like this:
* git-update-index will pick up the filesystem bits with the
current semantics (i.e. look only at (mode & 0100) and
force 0644 or 0755) by default; --full-perm-bits option
would bypass this bits munging.
Once a file is added with --full-perm-bits, it might be
nice if index file remembers to pick up the full bits next
time git-update-index is run on the path. This could be
achieved by saying that anything stored in the cache with
non 100644, 100755 nor 120000 bits are such paths without
having to change the index file format.
* there are bunch of codes that assume 0644 and 0755 are the
norm but also know that there are ancient trees that have
0664 and 0775 and try to treat them equivalently. They need
to be selectively neutered; this applies to in-tree
permission bits as well.
git-read-tree will read permission mode bits from tree
object as-is. I.e. you will get 0644 and 0755 in cache from
the existing tree objects. When you check things out with
002 umask, you will get 0664 and 0775 on the filesystem. We
do not want to consider this "mode changed by the user".
git-update-index --refresh code should not be mode neutered
to prevent this. The same thing goes for diff. These
currently canonicalize mode bits by looking at (mode &
0100), but should be changed to do so only when index has
already the canonical mode bits, or something like that.
* git-write-tree and git-fsck-objects probably has code to
reject and correct abnormal mode bits. They need to be
neutered.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 17:18 ` Junio C Hamano
@ 2005-10-03 17:28 ` H. Peter Anvin
0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 17:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
>
> For in-cache permission mode bits, we would probably need
> something like this:
>
> * git-update-index will pick up the filesystem bits with the
> current semantics (i.e. look only at (mode & 0100) and
> force 0644 or 0755) by default; --full-perm-bits option
> would bypass this bits munging.
>
> Once a file is added with --full-perm-bits, it might be
> nice if index file remembers to pick up the full bits next
> time git-update-index is run on the path. This could be
> achieved by saying that anything stored in the cache with
> non 100644, 100755 nor 120000 bits are such paths without
> having to change the index file format.
>
One could also say that since oddball permissions are an exception, not
the rule, that one should use a "git-chmod" command to enter the
permissions in the cache. The correct answer is probably *both* that
and --full-permissions (or whatever) since they both probably apply to
different workflows.
-hpa
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 16:30 ` H. Peter Anvin
2005-10-03 17:18 ` Junio C Hamano
@ 2005-10-03 17:45 ` Linus Torvalds
2005-10-03 18:05 ` H. Peter Anvin
1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2005-10-03 17:45 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Junio C Hamano, git
On Mon, 3 Oct 2005, H. Peter Anvin wrote:
>
> If the old-format (with random write bits) is out of circulation -- which I
> can't tell for sure they it is, but Linus' kernel tree doesn't seem to have
> any of these objects
Oh, it does.
Run "git-fsck-cache --full --strict", and you'll get several trees that
the strict checker marks as bad. I think it's mostly all one entry, namely
arch/i386/kernel/vsyscall-note.S being marked 0664.
Git itself has even more of them - the kernel actually has fewer, because
most work was done with a consistent umask of 022 (ie mostly mine), and by
the time others started using git actively, we'd already changed the git
rules.
However, there's nothing that says that we couldn't use one more bit in
the "mode" flag to just say "this is an _exact_ mode, please preserve it".
A kind of "sticky mode" for git. We've got _bits_ plenty: it's an ASCII
text-mode representation in the trees (infinite bits), and even in the
index it's a 32-bit thing that we only use 12 bits of (9 bits for
permissions, 3 bits for the sparsely represented directory/symlink/regular
file)
We'd have to be a bit careful to preserve that bit when doing an index
refresh, but it's really not very hard. The hardest part is actually doing
so for directories, since we don't keep the directories in the index at
_all_.
But the fact is, it wouldn't solve the git-tar-tree thing. We can
_represent_ exact masks, but we don't _want_ to, because normally it just
leads to horrible problems with different people having different umasks.
So in order to avoid having mode change merges, we'd _still_ have to make
the current "0666/0777 + umask" be the normal one, and you'd use this
"exact mode" thing only for very special cases (ie for backing up your
home directory or similar, _not: for a distributed SCM).
As to tar: I think the current
if (S_ISDIR(mode) || S_ISREG(mode))
mode |= (mode & 0100) ? 0777 : 0666;
is wrong. It makes things world-writable by default, and that's just
dangerous. "tar" normally won't apply umask when untarring (there's a flag
for it, but I have never ever used it myself, and I doubt anybody else
really does either - it's called "--no-same-permissions" in GNU tar).
I think a "0775" or "0664" might be acceptable (an umask of 002 is at
least _normal_), but I suspect 0755/0644 is really better. Doing a simple
chmod -R +w
afterwards is better (and takes umask into account) than a "chmod -R o-w",
since the latter leaves the tree writable for a while.
Ie default permissions are better off being too strict than too lax. Basic
security.
Of course, if we were to add the "exact mode" bit, then git-tar-tree
should obviously honor that for any files that have that bit set.
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 17:45 ` Linus Torvalds
@ 2005-10-03 18:05 ` H. Peter Anvin
2005-10-03 18:18 ` Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 18:05 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
Linus Torvalds wrote:
>
> As to tar: I think the current
>
> if (S_ISDIR(mode) || S_ISREG(mode))
> mode |= (mode & 0100) ? 0777 : 0666;
>
> is wrong. It makes things world-writable by default, and that's just
> dangerous.
That's standard in the Unix world, though; of course, the user's umask
shouldn't be set to zero unless things are in very special
circumstances. In the case of tar, the umask is applied on extraction
unless the user explicitly specifies -p.
-hpa
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 18:05 ` H. Peter Anvin
@ 2005-10-03 18:18 ` Linus Torvalds
2005-10-03 18:32 ` H. Peter Anvin
0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2005-10-03 18:18 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Junio C Hamano, git
On Mon, 3 Oct 2005, H. Peter Anvin wrote:
>
> Linus Torvalds wrote:
> >
> > As to tar: I think the current
> >
> > if (S_ISDIR(mode) || S_ISREG(mode))
> > mode |= (mode & 0100) ? 0777 : 0666;
> >
> > is wrong. It makes things world-writable by default, and that's just
> > dangerous.
>
> That's standard in the Unix world, though; of course, the user's umask
> shouldn't be set to zero unless things are in very special circumstances. In
> the case of tar, the umask is applied on extraction unless the user explicitly
> specifies -p.
Is it? The only place umask is mentioned in the man-page is
--no-same-permissions
apply user?s umask when extracting files instead of recorded
permissions
but if tar really does honor umask, then hey, that 0777/0666 is fine.
(Ahh, googling a bit more, it appears that "-p" is default for root, which
explains why you'd need the "anti-flag").
Linus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Honor extractor's umask in git-tar-tree.
2005-10-03 18:18 ` Linus Torvalds
@ 2005-10-03 18:32 ` H. Peter Anvin
0 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2005-10-03 18:32 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Junio C Hamano, git
Linus Torvalds wrote:
>
> Is it? The only place umask is mentioned in the man-page is
>
> --no-same-permissions
> apply user?s umask when extracting files instead of recorded
> permissions
>
> but if tar really does honor umask, then hey, that 0777/0666 is fine.
>
> (Ahh, googling a bit more, it appears that "-p" is default for root, which
> explains why you'd need the "anti-flag").
>
Yeah, that assymetry is rather unfortunate.
-hpa
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2005-10-03 18:32 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-30 16:03 Destructive side-effect of "cg-status" Wolfgang Denk
2005-10-01 10:24 ` Martin Langhoff
2005-10-01 16:41 ` Linus Torvalds
2005-10-01 18:14 ` Junio C Hamano
2005-10-01 19:07 ` Honor extractor's umask in git-tar-tree Junio C Hamano
2005-10-02 3:24 ` H. Peter Anvin
2005-10-02 9:55 ` Matthias Urlichs
2005-10-03 4:44 ` H. Peter Anvin
2005-10-03 5:10 ` Junio C Hamano
2005-10-03 16:30 ` H. Peter Anvin
2005-10-03 17:18 ` Junio C Hamano
2005-10-03 17:28 ` H. Peter Anvin
2005-10-03 17:45 ` Linus Torvalds
2005-10-03 18:05 ` H. Peter Anvin
2005-10-03 18:18 ` Linus Torvalds
2005-10-03 18:32 ` H. Peter Anvin
2005-10-01 19:42 ` Destructive side-effect of "cg-status" Wolfgang Denk
2005-10-01 20:24 ` Linus Torvalds
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).