From: Michael Haggerty <mhagger@alum.mit.edu>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
cmn@elego.de, A Large Angry SCM <gitzilla@gmail.com>,
Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH v2 0/7] Improved infrastructure for refname normalization
Date: Sat, 10 Sep 2011 08:50:36 +0200 [thread overview]
Message-ID: <1315637443-14012-1-git-send-email-mhagger@alum.mit.edu> (raw)
Patch series re-roll:
The main difference is that I incorporated feedback from Junio that it
should be made more difficult to use unnormalized references. The
mechanism that I selected is a bit different than discussed in the
mailing list; I changed normalize_refname() to accept unnormalized
refnames *only* if the dst argument is non-NULL. I changed
check_ref_format() to reject unnormalized refnames but implemented a
transitional function check_ref_format_unsafe() that has the old
behavior and changed callers to use the _unsafe function to avoid
changing their behavior.
As a prerequisite to storing references caches hierarchically (itself
needed for performance reasons), here is a patch series to help us get
refname normalization under control.
The problem is that some UI accepts unnormalized reference names (like
"/foo/bar" or "foo///bar" instead of "foo/bar") and passes them on to
library routines without normalizing them. The library, on the other
hand, assumes that the refnames are normalized. Sometimes (mostly in
the case of loose references) unnormalized refnames happen to work,
but in other cases (like packed references or when looking up refnames
in the cache) they silently fail. Given that refnames are sometimes
treated as path names, there is a chance that some security-relevant
bugs are lurking in this area, if not in git proper then in scripts
that interact with git.
This patch series adds the following tools for dealing with refnames
and their normalization (without actually doing much to fix the
problem; see below):
* Fix check_ref_format() to make it easier and reliable to specify
which types of refnames are allowed in a particular situation
(multi-level vs. one-level, with vs. without a refspec-like "*").
This function only accepts normalized refnames.
* Add a function normalize_refname() that accepts unnormalized
refnames, checks the format, and outputs a normalized version.
* Add a function check_ref_format_unsafe() that has the old behavior.
Change callers to use this function until they can be fixed.
* Add options to "git check-ref-format" to give scripts access to
these facilities (and to allow them to be tested in the test suite).
* Forbid ".lock" at the end of any refname component, as directories
with such names can conflict with attempts to create lock files for
other refnames.
I suggest the following policy for handling unnormalized refnames more
consistently:
Unnormalized refnames should only be accepted at the UI level and
should be normalized before use. This can be done using code like
int len = strlen(arg);
char *normalized_refname = xmalloc(len);
if (normalize_refname(normalized_refname, len, arg, flags))
die("invalid refname '%s'", arg);
/* From now on, use normalized_refname. */
Refnames coming from other sources, such as from a remote repository,
should be checked that they are in the correct *normalized* format,
like so:
if (check_ref_format(refname, flags))
die("invalid refname '%s'", refname);
Refnames from the local repository (e.g., from the packed references
file) should also be checked that they are in the correct normalized
format, though this policy could be debated if there are performance
concerns.
Refnames probably do not need to be checked at the entrance to the
refs.{c,h} library functions because callers are responsible for not
passing invalid or unnormalized refnames in. However, some assert()s
would probably be justified, especially during the transition while we
are fixing broken callers.
If there is agreement about this policy, I would be happy to write it
up (presumably in Documentation/technical/api-ref.txt, maybe also
incorporating the content from
Documentation/technical/api-ref-iteration.txt).
I do not yet have enough global overview of the code to know which
callers need fixing, but once these tools are in place the callers can
be fixed incrementally.
Michael Haggerty (7):
t1402: add some more tests
Change bad_ref_char() to return a boolean value
git check-ref-format: add options --allow-onelevel and
--refspec-pattern
Change check_ref_format() to take a flags argument
Add a library function normalize_refname()
Do not allow ".lock" at the end of any refname component
Add tools to avoid the use of unnormalized refnames.
Documentation/git-check-ref-format.txt | 44 ++++++++--
builtin/check-ref-format.c | 76 +++++++++---------
builtin/checkout.c | 2 +-
builtin/fetch-pack.c | 2 +-
builtin/receive-pack.c | 2 +-
builtin/replace.c | 2 +-
builtin/show-ref.c | 2 +-
builtin/tag.c | 4 +-
connect.c | 2 +-
environment.c | 2 +-
fast-import.c | 7 +--
notes-merge.c | 5 +-
pack-refs.c | 2 +-
refs.c | 139 +++++++++++++++++++------------
refs.h | 46 +++++++++-
remote.c | 54 ++++---------
sha1_name.c | 4 +-
t/t1402-check-ref-format.sh | 96 ++++++++++++++++++++--
transport.c | 17 +---
walker.c | 2 +-
20 files changed, 325 insertions(+), 185 deletions(-)
--
1.7.6.8.gd2879
next reply other threads:[~2011-09-10 6:51 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-10 6:50 Michael Haggerty [this message]
2011-09-10 6:50 ` [PATCH v2 1/7] t1402: add some more tests Michael Haggerty
2011-09-10 6:50 ` [PATCH v2 2/7] Change bad_ref_char() to return a boolean value Michael Haggerty
2011-09-10 6:50 ` [PATCH v2 3/7] git check-ref-format: add options --allow-onelevel and --refspec-pattern Michael Haggerty
2011-09-10 6:50 ` [PATCH v2 4/7] Change check_ref_format() to take a flags argument Michael Haggerty
2011-09-10 6:50 ` [PATCH v2 5/7] Add a library function normalize_refname() Michael Haggerty
2011-09-10 6:50 ` [PATCH v2 6/7] Do not allow ".lock" at the end of any refname component Michael Haggerty
2011-09-10 6:50 ` [PATCH v2 7/7] Add tools to avoid the use of unnormalized refnames Michael Haggerty
2011-09-12 4:28 ` [PATCH v2 0/7] Improved infrastructure for refname normalization Junio C Hamano
2011-09-12 15:11 ` Michael Haggerty
2011-09-12 16:44 ` Junio C Hamano
2011-09-13 4:16 ` Michael Haggerty
2011-09-13 17:43 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1315637443-14012-1-git-send-email-mhagger@alum.mit.edu \
--to=mhagger@alum.mit.edu \
--cc=cmn@elego.de \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=gitzilla@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).