Re: [PATCH] create_ref_entry(): move check_refname_format() call to callers

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Michael Haggerty <mhagger@alum.mit.edu>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH] create_ref_entry(): move check_refname_format() call to callers
Date: Mon, 30 Apr 2012 18:18:53 +0200	[thread overview]
Message-ID: <4F9EBB6D.3090900@alum.mit.edu> (raw)
In-Reply-To: <20120429115831.GC24254@sigill.intra.peff.net>

On 04/29/2012 01:58 PM, Jeff King wrote:
> On Sun, Apr 29, 2012 at 08:18:08AM +0200, mhagger@alum.mit.edu wrote:
>
>> I will work on providing more infrastructure for checking refnames at
>> varying levels of strictness, but I don't know enough about the code
>> paths to be able to find the places where the strictness levels need
>> tweaking.
>>
>> For this to work, the various callers of check_refname_format() will
>> have to be able to influence the level of strictness that they want to
>> enforce.  This patch is one trivial step in that direction.
>
> It seems like the create_ref_entry code paths should _always_ just be
> issuing warnings, as they are about reading existing refs, no? The die()
> side should only come when we are writing refs.

There are not two but three classes of refnames:

1. Fully legal refnames (according to the rules of check-ref-format).

2. Refnames that might cause parsing trouble in some circumstances but 
could otherwise be tolerated (with a warning) in the hope that the user 
can delete them before they cause further confusion.  These include all 
illegal refnames that aren't covered by (3).

3. Refnames that are so pathological that we don't want to let them into 
our code at all, under any circumstances.  This category, I think, 
includes refnames that include "/../" (because they could cause our code 
to walk up the filesystem) or LF (because if written to a packed-refs 
file they would make it unreadable) and perhaps "//" (because they would 
confuse the loose reference code and the hierarchical reference cache 
code).  And depending on how much you trust shell scripts to do quoting 
correctly, other patterns might also be risky.

I would like to be able to distinguish between (2) and (3) before 
deciding what to do in any specific cases.  References in category (2) 
can probably be accepted (with a warning) in old data but we should not 
allow the user to create new ones.  References in category (3) are more 
critical; I see three options for dealing with them: die(), emit a 
warning then drop them on the floor, or emit a warning but handle them 
somehow (for example, by URL-encoding them).

Treating category (3) the same as category (2) was more or less the 
status quo before the changes, but I think it is dangerous, especially 
when dealing with references that might come from remote sources (do you 
disagree?).

Regarding create_ref_entry(), there are three callers:

* read_packed_refs(): Only used to read references from local
   packed-refs files.  Seems uncritical in terms of security, so for now
   I suppose we could change this caller to emit a warning but use the
   reference anyway.  (It would not be a good idea to emit a warning but
   *not* use the reference, because the next time the packed-refs file
   is rewritten the reference would be lost.)

* get_ref_dir(): Only used to read loose references from the local
   filesystem.  Seems uncritical in terms of security, so for now
   I suppose we could change this caller to emit a warning but use the
   reference anyway.  Alternately, we could emit a warning but not use
   the reference; this would not result in any data loss because nobody
   would have a reason to remove the loose reference file.

* add_packed_ref(): Used by write_remote_refs() to insert references
   from a cloned repository into the local packed-refs file.  Here I
   think we have to be paranoid about accepting refnames in category
   (3).  For example, it might be reasonable to emit a warning
   mentioning the illegal reference name *and the SHA1 that it referred
   to* then to drop it on the floor.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

next prev parent reply	other threads:[~2012-04-30 16:19 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-29  6:18 [PATCH] create_ref_entry(): move check_refname_format() call to callers mhagger
2012-04-29 11:58 ` Jeff King
2012-04-30  6:15   ` Junio C Hamano
2012-04-30 16:18   ` Michael Haggerty [this message]
2012-04-30 17:14     ` Junio C Hamano
2012-04-30 20:29       ` Michael Haggerty
2012-04-30 21:11         ` Junio C Hamano
2012-05-02 15:38           ` Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F9EBB6D.3090900@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).