git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: [PATCH] create_ref_entry(): move check_refname_format() call to callers
Date: Mon, 30 Apr 2012 22:29:01 +0200	[thread overview]
Message-ID: <4F9EF60D.8030301@alum.mit.edu> (raw)
In-Reply-To: <7vd36pgn0e.fsf@alter.siamese.dyndns.org>

On 04/30/2012 07:14 PM, Junio C Hamano wrote:
> Michael Haggerty<mhagger@alum.mit.edu>  writes:
>
>> There are not two but three classes of refnames:
>>
>> 1. Fully legal refnames (according to the rules of check-ref-format).
>>
>> 2. Refnames that might cause parsing trouble in some circumstances but
>> could otherwise be tolerated (with a warning) in the hope that the
>> user can delete them before they cause further confusion.  These
>> include all illegal refnames that aren't covered by (3).
>>
>> 3. Refnames that are so pathological that we don't want to let them
>> into our code at all, under any circumstances.  This category, I
>> think, includes refnames that include "/../" (because they could cause
>> our code to walk up the filesystem) or LF (because if written to a
>> packed-refs file they would make it unreadable) and perhaps "//"
>> (because they would confuse the loose reference code and the
>> hierarchical reference cache code).  And depending on how much you
>> trust shell scripts to do quoting correctly, other patterns might also
>> be risky.
>>[...]
>> Treating category (3) the same as category (2) was more or less the
>> status quo before the changes, but I think it is dangerous, especially
>> when dealing with references that might come from remote sources (do
>> you disagree?).
>
> I actually do. [...example involving class (2) refname omitted...]
>
> What kind of "danger" do you have in mind?

I agree with you about how class (2) refnames can be handled, namely on 
a best-effort basis, without paranoia.  The "danger" that I worry about 
is if we treat class (3) refnames the same non-paranoid way that we 
treat class (2) refnames.  To be sure, my worries are at the level of 
"it seems like a lot of data paths have to be trusted and I have not 
personally verified that there is no danger and I have a vivid 
imagination" rather than "I have figured out how to implement an exploit".

For example, have all of the following code paths been audited to make 
sure that they cannot introduce class (3) refnames into a repository 
(including via symbolic refs with class (3) targets) even in the face of 
a malicious remote?  Can we (and do we want to) rely on this level of 
vigilance being sustained in the future?

* git clone (for all transport types)?

* git push (for all transport types)?

* git fast-import?

* git am?

* All kinds of remote helpers (git svn, etc.) and import scripts (e.g., 
the malicious refs might come directly from a Subversion repository)?

...no doubt there are others.

What if a malicious repository is copied using rsync (i.e., not "git 
clone rsync:..." but straight rsync)?  Granted, the victim would have 
worse problems if he didn't delete any version of $GIT/config that was 
contained in the copy before using it.

Banning class (3) references at lib level would provide a second level 
of defense against errors in the outer level.  Allowing "class 3" 
references at the lib level, by comparison, means putting a lot of 
confidence in the auditing of all of the ref-creating code paths.

It's true that a loose reference cannot (by construction) contain 
constructs like "/../" or "//".  And packed refs are not stored via 
filesystem paths, so such constructs are not a direct hazard.  But there 
are lots of ways that a packed ref can be turned into a loose ref, and 
the process of creating a loose ref from a class (3) packed ref could 
cause trouble.

I propose that I implement a REFNAME_NONSTRICT flag which relaxes the 
normal rules as follows:

* ".." is only forbidden if it is a complete refname component

* All characters (including control characters, '?', '*', and '[') are 
allowed, except for NUL, SP, LF, ':', and '\'.

* The character sequence "@{" is *not* prohibited.

The reason I suggest excluding ':' and '\' is because of their 
significance in DOS pathnames; let me know if you think this is unnecessary.

Do you want these changes in master or in maint (or even older)?

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2012-04-30 20:29 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-29  6:18 [PATCH] create_ref_entry(): move check_refname_format() call to callers mhagger
2012-04-29 11:58 ` Jeff King
2012-04-30  6:15   ` Junio C Hamano
2012-04-30 16:18   ` Michael Haggerty
2012-04-30 17:14     ` Junio C Hamano
2012-04-30 20:29       ` Michael Haggerty [this message]
2012-04-30 21:11         ` Junio C Hamano
2012-05-02 15:38           ` Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F9EF60D.8030301@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).