From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: [PATCH] create_ref_entry(): move check_refname_format() call to callers
Date: Mon, 30 Apr 2012 22:29:01 +0200 [thread overview]
Message-ID: <4F9EF60D.8030301@alum.mit.edu> (raw)
In-Reply-To: <7vd36pgn0e.fsf@alter.siamese.dyndns.org>
On 04/30/2012 07:14 PM, Junio C Hamano wrote:
> Michael Haggerty<mhagger@alum.mit.edu> writes:
>
>> There are not two but three classes of refnames:
>>
>> 1. Fully legal refnames (according to the rules of check-ref-format).
>>
>> 2. Refnames that might cause parsing trouble in some circumstances but
>> could otherwise be tolerated (with a warning) in the hope that the
>> user can delete them before they cause further confusion. These
>> include all illegal refnames that aren't covered by (3).
>>
>> 3. Refnames that are so pathological that we don't want to let them
>> into our code at all, under any circumstances. This category, I
>> think, includes refnames that include "/../" (because they could cause
>> our code to walk up the filesystem) or LF (because if written to a
>> packed-refs file they would make it unreadable) and perhaps "//"
>> (because they would confuse the loose reference code and the
>> hierarchical reference cache code). And depending on how much you
>> trust shell scripts to do quoting correctly, other patterns might also
>> be risky.
>>[...]
>> Treating category (3) the same as category (2) was more or less the
>> status quo before the changes, but I think it is dangerous, especially
>> when dealing with references that might come from remote sources (do
>> you disagree?).
>
> I actually do. [...example involving class (2) refname omitted...]
>
> What kind of "danger" do you have in mind?
I agree with you about how class (2) refnames can be handled, namely on
a best-effort basis, without paranoia. The "danger" that I worry about
is if we treat class (3) refnames the same non-paranoid way that we
treat class (2) refnames. To be sure, my worries are at the level of
"it seems like a lot of data paths have to be trusted and I have not
personally verified that there is no danger and I have a vivid
imagination" rather than "I have figured out how to implement an exploit".
For example, have all of the following code paths been audited to make
sure that they cannot introduce class (3) refnames into a repository
(including via symbolic refs with class (3) targets) even in the face of
a malicious remote? Can we (and do we want to) rely on this level of
vigilance being sustained in the future?
* git clone (for all transport types)?
* git push (for all transport types)?
* git fast-import?
* git am?
* All kinds of remote helpers (git svn, etc.) and import scripts (e.g.,
the malicious refs might come directly from a Subversion repository)?
...no doubt there are others.
What if a malicious repository is copied using rsync (i.e., not "git
clone rsync:..." but straight rsync)? Granted, the victim would have
worse problems if he didn't delete any version of $GIT/config that was
contained in the copy before using it.
Banning class (3) references at lib level would provide a second level
of defense against errors in the outer level. Allowing "class 3"
references at the lib level, by comparison, means putting a lot of
confidence in the auditing of all of the ref-creating code paths.
It's true that a loose reference cannot (by construction) contain
constructs like "/../" or "//". And packed refs are not stored via
filesystem paths, so such constructs are not a direct hazard. But there
are lots of ways that a packed ref can be turned into a loose ref, and
the process of creating a loose ref from a class (3) packed ref could
cause trouble.
I propose that I implement a REFNAME_NONSTRICT flag which relaxes the
normal rules as follows:
* ".." is only forbidden if it is a complete refname component
* All characters (including control characters, '?', '*', and '[') are
allowed, except for NUL, SP, LF, ':', and '\'.
* The character sequence "@{" is *not* prohibited.
The reason I suggest excluding ':' and '\' is because of their
significance in DOS pathnames; let me know if you think this is unnecessary.
Do you want these changes in master or in maint (or even older)?
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
next prev parent reply other threads:[~2012-04-30 20:29 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-29 6:18 [PATCH] create_ref_entry(): move check_refname_format() call to callers mhagger
2012-04-29 11:58 ` Jeff King
2012-04-30 6:15 ` Junio C Hamano
2012-04-30 16:18 ` Michael Haggerty
2012-04-30 17:14 ` Junio C Hamano
2012-04-30 20:29 ` Michael Haggerty [this message]
2012-04-30 21:11 ` Junio C Hamano
2012-05-02 15:38 ` Michael Haggerty
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F9EF60D.8030301@alum.mit.edu \
--to=mhagger@alum.mit.edu \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).