From: Brandon Williams <bmwill@google.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] submodule: munge paths to submodule git directories
Date: Tue, 14 Aug 2018 11:04:06 -0700 [thread overview]
Message-ID: <20180814180406.GA86804@google.com> (raw)
In-Reply-To: <20180809212602.GA11342@sigill.intra.peff.net>
On 08/09, Jeff King wrote:
> On Wed, Aug 08, 2018 at 03:33:23PM -0700, Brandon Williams wrote:
>
> > Commit 0383bbb901 (submodule-config: verify submodule names as paths,
> > 2018-04-30) introduced some checks to ensure that submodule names don't
> > include directory traversal components (e.g. "../").
> >
> > This addresses the vulnerability identified in 0383bbb901 but the root
> > cause is that we use submodule names to construct paths to the
> > submodule's git directory. What we really should do is munge the
> > submodule name before using it to construct a path.
> >
> > Teach "submodule_name_to_gitdir()" to munge a submodule's name (by url
> > encoding it) before using it to build a path to the submodule's gitdir.
>
> I like this approach very much, and I think using url encoding is much
> better than an opaque hash (purely because it makes debugging and
> inspection saner).
>
> Two thoughts, though:
>
> > + modules_len = buf->len;
> > strbuf_addstr(buf, submodule_name);
> > +
> > + /*
> > + * If the submodule gitdir already exists using the old-fashioned
> > + * location (which uses the submodule name as-is, without munging it)
> > + * then return that.
> > + */
> > + if (!access(buf->buf, F_OK))
> > + return;
>
> I think this backwards-compatibility is necessary to avoid pain. But
> until it goes away, I don't think this is helping the vulnerability from
> 0383bbb901. Because there the issue was that the submodule name pointed
> back into the working tree, so this access() would find the untrusted
> working tree code and say "ah, an old-fashioned name!".
>
> In theory a fresh clone could set a config option for "I only speak
> use new-style modules". And there could even be a conversion program
> that moves the modules as appropriate, fixes up the .git files in the
> working tree, and then sets that config.
>
> In fact, I think that config option _could_ be done by bumping
> core.repositoryformatversion and then setting extensions.submodulenames
> to "url" or something. Then you could never run into the confusing case
> where you have a clone done by a new version of git (using new-style
> names), but using an old-style version gets confused because it can't
> find the module directories (instead, it would barf and say "I don't
> know about that extension").
>
> I don't know if any of that is worth it, though. We already fixed the
> problem from 0383bbb901. There may be a _different_ "break out of the
> modules directory" vulnerability, but since we disallow ".." it's hard
> to see what it would be (the best I could come up with is maybe pointing
> one module into the interior of another module, but I think you'd have
> to trouble overwriting anything useful).
>
> And while an old-style version of Git being confused might be annoying,
> I suspect that bumping the repository version would be even _more_
> annoying, because it would hit every command, not just ones that try to
> touch those submodules.
Oh I know that this doesn't help with that vulnerability. As you've
said we fix it and now disallow ".." at the submodule-config level so
really this path is simply about using what we get out of
submodule-config in a more sane manor.
>
> > diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> > index 2c2c97e144..963693332c 100755
> > --- a/t/t7400-submodule-basic.sh
> > +++ b/t/t7400-submodule-basic.sh
> > @@ -933,7 +933,7 @@ test_expect_success 'recursive relative submodules stay relative' '
> > cd clone2 &&
> > git submodule update --init --recursive &&
> > echo "gitdir: ../.git/modules/sub3" >./sub3/.git_expect &&
> > - echo "gitdir: ../../../.git/modules/sub3/modules/dirdir/subsub" >./sub3/dirdir/subsub/.git_expect
> > + echo "gitdir: ../../../.git/modules/sub3/modules/dirdir%2fsubsub" >./sub3/dirdir/subsub/.git_expect
>
> One interesting thing about url-encoding is that it's not one-to-one.
> This case could also be %2F, which is a different file (on a
> case-sensitive filesystem). I think "%20" and "+" are similarly
> interchangeable.
>
> If we were decoding the filenames, that's fine. The round-trip is
> lossless.
>
> But that's not quite how the new code behaves. We encode the input and
> then check to see if it matches an encoding we previously performed. So
> if our urlencode routines ever change, this will subtly break.
>
> I don't know how much it's worth caring about. We're not that likely to
> change the routines ourself (though certainly a third-party
> implementation would need to know our exact url-encoding decisions).
This is exactly the reason why I wanted to get some opinions on what the
best thing to do here would be. I _think_ the best thing would probably
be to write a specific routine to do the conversion, and it wouldn't
even have to be all that complex. Basically I'm just interested in
converting '/' characters so that things no longer behave like
nested directories.
>
> Some possible actions:
>
> 0. Do nothing, and cross our fingers. ;)
>
> 1. Don't use strbuf_addstr_urlencode(), but rather our own munging
> function which we know will remain stable (or alternatively, a flag
> to strbuf_addstr_urlencode to get the consistent behavior).
>
> 2. Make sure we have tests which cover this, so at least somebody
> changing the urlencode decisions will see a breakage. Your test here
> covers the upper/lowercase one, but we might want one that covers
> "+". (There may be more ambiguous cases, but those are the ones I
> know about).
>
> 3. Rather than check for the existence of names, decode what's actually
> in the modules/ directory to create an in-memory index of names.
>
> I hesitate to suggest that, because it's obviously way more
> complicated, and may perform worse if you have a lot of modules
> (since you have to readdir() and decode the whole directory just to
> look up one module).
>
> But I think it also gives a more elegant solution to the
> backwards-compatibility problem, since we could recognize both new
> and old-style names. There's some ambiguity (e.g., is "foo%2fbar"
> "foo/bar", or did somebody really have a name with a percent in
> it?),. but in theory you could respect either name (giving
> preference to new-style in case of a conflict).
>
> And I think the result would be immune to any directory-escape
> vulnerabilities, because we'd always start with what actually exists
> in $GIT_DIR/modules/, which we know _we_ will have written.
>
> Again, I'm not sure if it's worth the effort, but I thought I'd
> throw it out there.
>
> -Peff
--
Brandon Williams
next prev parent reply other threads:[~2018-08-14 18:04 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-07 23:06 [RFC] submodule: munge paths to submodule git directories Brandon Williams
2018-08-07 23:25 ` Jonathan Nieder
2018-08-08 0:14 ` Junio C Hamano
2018-08-08 22:33 ` [PATCH 0/2] munge submodule names Brandon Williams
2018-08-08 22:33 ` [PATCH 1/2] submodule: create helper to build paths to submodule gitdirs Brandon Williams
2018-08-08 23:21 ` Stefan Beller
2018-08-09 0:45 ` Brandon Williams
2018-08-10 21:27 ` Junio C Hamano
2018-08-10 21:45 ` Brandon Williams
2018-08-08 22:33 ` [PATCH 2/2] submodule: munge paths to submodule git directories Brandon Williams
2018-08-09 21:26 ` Jeff King
2018-08-14 18:04 ` Brandon Williams [this message]
2018-08-14 18:57 ` Jonathan Nieder
2018-08-14 21:08 ` Stefan Beller
2018-08-14 21:12 ` Jonathan Nieder
2018-08-14 22:34 ` Stefan Beller
2018-08-16 2:34 ` Jonathan Nieder
2018-08-16 2:39 ` Stefan Beller
2018-08-16 2:47 ` Jonathan Nieder
2018-08-16 17:34 ` Brandon Williams
2018-08-16 18:19 ` [PATCH] submodule: add config for where gitdirs are located Brandon Williams
2018-08-20 22:03 ` Junio C Hamano
2018-08-16 15:07 ` [PATCH 2/2] submodule: munge paths to submodule git directories Junio C Hamano
2018-08-14 18:58 ` Jeff King
2018-08-28 21:35 ` Stefan Beller
2018-08-29 5:25 ` Jeff King
2018-08-29 18:10 ` Stefan Beller
2018-08-29 21:03 ` Jeff King
2018-08-29 21:10 ` Stefan Beller
2018-08-29 21:18 ` Jonathan Nieder
2018-08-29 21:27 ` Stefan Beller
2018-08-29 21:30 ` Jeff King
2018-08-29 21:09 ` Jonathan Nieder
2018-08-29 21:14 ` Stefan Beller
2018-08-29 21:25 ` Brandon Williams
2018-08-29 21:32 ` Jeff King
2018-08-16 0:19 ` Aaron Schrab
2019-01-15 1:25 ` [RFC] " Jonathan Nieder
2019-01-17 17:32 ` Jeff King
2019-01-17 17:57 ` Stefan Beller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180814180406.GA86804@google.com \
--to=bmwill@google.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.