git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Reference has invalid format: check maybe a bit to harsh?
@ 2011-10-31 19:14 Peter Oberndorfer
  2011-10-31 19:54 ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Oberndorfer @ 2011-10-31 19:14 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

Hi,

i am using the next branch for testing and i noticed the following:
about any git command i execute in a certain repo dies with
fatal: Reference has invalid format: 
'refs/patches/obd_development/blah:_various_improvements_remote_debugging'

This is probably caused by
dce4bab6567de7c458b334e029e3dedcab5f2648 add_ref(): verify that the refname is 
formatted correctly

The invalid refs(about 30, loose and packed) containing a ':' were created by 
stgit a long time ago(Dec 2006)

Personally i do not care too much, i patched my git to not die at this point 
but to only display a error.
-> The invalid refs are not accessible, but the rest of the repo still works.

But i'm just wondering if dieing when seeing a single invalid ref might be a 
bit too harsh since no git tools can be used anymore on this repo at all.


Small side note:
It seems t1402-check-ref-format.sh contains not test
for the invalid ref char ':' yet.
(i do not know if it is tested somewhere else...)

Thanks,
Greetings Peter

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Reference has invalid format: check maybe a bit to harsh?
  2011-10-31 19:14 Reference has invalid format: check maybe a bit to harsh? Peter Oberndorfer
@ 2011-10-31 19:54 ` Junio C Hamano
  2011-10-31 20:19   ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2011-10-31 19:54 UTC (permalink / raw)
  To: Peter Oberndorfer; +Cc: git, Michael Haggerty

Peter Oberndorfer <kumbayo84@arcor.de> writes:

> The invalid refs(about 30, loose and packed) containing a ':' were created by 
> stgit a long time ago(Dec 2006)

I think even back then colon was one of the forbidden letters in a
refname. Of course, it is entirely possible that broken third-party tools
may have created such file that is not a ref in .git/refs hierarchy by
hand, and we may not be carefully rejecting such broken refs for a long
time.

    ... Goes and asks "git blame" ...

03feddd (git-check-ref-format: reject funny ref names., 2005-10-13)
started disallowing control characters and other characters that are used
for range operators and the separator between LHS and RHS of refspecs,
further tightened by 6828399 (Forbid pattern maching characters in
refnames., 2005-12-15).

> But i'm just wondering if dieing when seeing a single invalid ref might be a 
> bit too harsh since no git tools can be used anymore on this repo at all.

I agree that we would want to give users an escape hatch.  That is, if we
can make something like this to work:

    c=$(git rev-parse --force refs/patches/obd_development/blah:_vari...)
    git update-ref refs/patches/obd_development/blah--various-improvements $c

I think we would be in a good shape.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Reference has invalid format: check maybe a bit to harsh?
  2011-10-31 19:54 ` Junio C Hamano
@ 2011-10-31 20:19   ` Junio C Hamano
  2011-11-01  9:59     ` Michael Haggerty
  0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2011-10-31 20:19 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Peter Oberndorfer, git

Junio C Hamano <gitster@pobox.com> writes:

> I agree that we would want to give users an escape hatch.  That is, if we
> can make something like this to work:
>
>     c=$(git rev-parse --force refs/patches/obd_development/blah:_vari...)
>     git update-ref refs/patches/obd_development/blah--various-improvements $c

Also we would need to be able to say

    git update-ref -d refs/patches/obd_development/blah:_vari...

to get rid of the offending one.

> I think we would be in a good shape.

Having said all that, I think we should in general loosen the checks done
on the reading side a lot more. The "checks" themselves should stay, can
give loud warnings, and even can error out when appropriate, but in an
operation that is necessary to recover from _existing_ breakage (like the
one in this thread, a file with a colon in its name in .git/refs/), the
ability to read and to remove is essential for recovery.

I vaguely recall we had to apply a fix in the same spirit to loosen
reading side after the offending topic was merged to 'master' during this
cycle about $GIT_DIR/config not possibly being a ref getting warned, or
something.

Michael, what do you think?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Reference has invalid format: check maybe a bit to harsh?
  2011-10-31 20:19   ` Junio C Hamano
@ 2011-11-01  9:59     ` Michael Haggerty
  0 siblings, 0 replies; 4+ messages in thread
From: Michael Haggerty @ 2011-11-01  9:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Peter Oberndorfer, git

On 10/31/2011 09:19 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>> I agree that we would want to give users an escape hatch.  That is, if we
>> can make something like this to work:
>>
>>     c=$(git rev-parse --force refs/patches/obd_development/blah:_vari...)
>>     git update-ref refs/patches/obd_development/blah--various-improvements $c
> 
> Also we would need to be able to say
> 
>     git update-ref -d refs/patches/obd_development/blah:_vari...
> 
> to get rid of the offending one.
> 
>> I think we would be in a good shape.
> 
> Having said all that, I think we should in general loosen the checks done
> on the reading side a lot more. The "checks" themselves should stay, can
> give loud warnings, and even can error out when appropriate, but in an
> operation that is necessary to recover from _existing_ breakage (like the
> one in this thread, a file with a colon in its name in .git/refs/), the
> ability to read and to remove is essential for recovery.
> 
> I vaguely recall we had to apply a fix in the same spirit to loosen
> reading side after the offending topic was merged to 'master' during this
> cycle about $GIT_DIR/config not possibly being a ref getting warned, or
> something.
> 
> Michael, what do you think?

Supporting invalid reference names some places, but not others, and this
perhaps (as in the case of ":") platform-dependent would be a big can of
worms (as can be seen by my attempt to summarize the situation, a few
paragraphs below).

I see the situation as a conflict between security and reliability on
the one hand and backwards-compatibility and/or the ability to recover
from old mistakes on the other hand.  For example, reference names like
the following are problematic in earlier git releases:

* "refs/heads/foo/../../../../etc/passwd" -- here be dragons

* "refs/heads/foo.lock/bar" -- screws up the locking of a reference
called "refs/heads/foo"

* "refs/heads/prn:/waste/my/paper" -- is questionable on Windows, I believe

* "refs/heads//foo" -- would be normalized to "refs/heads/foo" in
certain circumstances but not others

Perhaps instead of arguing about exactly what command arguments are
treated strictly vs laxly and attempting to half-support invalid
reference names, we could solve 98% of the problem with a couple of
localized measures:

1. Something in the git_snpath() callchain could prevent paths referring
to files outside of $GIT_DIR from being generated (by normalizing the
paths, stripping out "." and "..", etc).  I suppose that this change
would remove a lot of potential security issues at a stroke, and make it
less important for refname handling to be paranoid.

2. Invalid references could be detected and fixed via some mode of "git
fsck".  This would then be the only codepath that has to handle invalid
reference names, and would take that burden off of the rest of the code.
 "git fsck" could sanitize the names of invalid references and move them
to some kind of "lost+found" namespace.  (Disadvantage: it could be
impractical to run "git fsck" on a remote repository to which one
doesn't have filesystem access.)


If one would want to plunge in with a complicated solution, things will
get messy.  Here is an attempt at an exhaustive summary of the changes
since v1.7.7 that affect how strictly refnames are checked for validity,
and ideas for how one could work around problems of backwards
compatibility in the particular cases.

1. read_packed_refs() -- the old behavior was to accept *anything* found
in the packed-refs file; however, invalid references could not be worked
with reliably.  Now checks that packed-refs refnames are valid and
die()s if not.

   a. The old behavior could be restored for now.
   a'. The old behavior could be restored, except that renames with
leading, trailing, or duplicate slashes could silently be normalized.
This could lead to collisions between unnormalized and normalized names
that were previously distinct.  Such a collision is harmless if the two
symbols have the same value, but currently causes a fatal error if they
have distinct values.
   b. Invalid references could be silently ignored (this would cause
them to be silently discarded at the next pack-refs).
   c. Invalid references could be ignored with a warning (the warning
could include the SHA1).  This would cause a lot of warnings to be
output until the next pack-refs or the repository is fixed some other way.

2. get_ref_dir() (reads loose refs) -- the old behavior was to accept
anything that was found as a filesystem path under "$GIT_DIR/refs"
except paths with components starting with "." or ending with ".lock".
Now checks that loose-ref refnames are valid and die()s if not.

   a. The old behavior could be restored for now.
   b. Invalid references could be silently ignored (this would cause
them to be carried around forever in the local repository, including
after a pack-refs, but omitted when the local repository is cloned).
   c. Invalid references could be ignored with a warning (the warning
could include the SHA1).  This would cause a lot of warnings to be
output until the repository is fixed somehow.

3. add_extra_ref() -- old behavior was to accept anything.  But since
extra refs are generated locally and the names are not used, this
behavior shouldn't be a problem and could be restored for now.

4. check_refname_format() -- now disallows any refname component than
ends with ".lock" (previously only the last component was checked).  Now
disallows DEL character in refnames, in agreement with the old
specification.

5. resolve_ref() -- previously passed its argument to git_snpath() and
tried to open the path without any verification whatsoever.  Treated the
contents of symbolic refs similarly.  Now checks the validity of
refnames at both steps.

   a. The old behavior could be restored, but this would be very
questionable given how refs like "foo/../../../etc/passed" might be handled.
   b. There could be some laxer level of checking of security-relevant
issues without enforcing all of the refname rules.

6. The infamous change that caused files that looked like loose
references but have invalid *contents* to cause a warning to be emitted
(strictly speaking, this doesn't affect refnames but reference contents).

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-11-01 10:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-31 19:14 Reference has invalid format: check maybe a bit to harsh? Peter Oberndorfer
2011-10-31 19:54 ` Junio C Hamano
2011-10-31 20:19   ` Junio C Hamano
2011-11-01  9:59     ` Michael Haggerty

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).