Limited operations in unsafe repositories

public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed

* Limited operations in unsafe repositories
@ 2024-01-07 19:40 brian m. carlson
  2024-01-10 12:05 ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: brian m. carlson @ 2024-01-07 19:40 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1494 bytes --]

Right now, any time we try to set up a repository in that's owned by
another user, we die.  While good for security, this is inconvenient in
a bunch of ways.

For example, when Git LFS wants to push data locally, it needs to know
where the `.git` directory is because it pushes the objects into
`.git/lfs`.  Thus, we want to do `git rev-parse --absolute-git-dir` to
find the remote Git directory, but we can't do that if the repository is
owned by a different user.

That issue also affects the Git LFS SSH transfer server (Scutiger),
which also needs to read the configuration (to set the umask
appropriately for `core.sharedrepository`).

I had looked at sending a patch to make `git rev-parse` operate in a
special mode where it's impossible to invoke any binaries at all, but
unfortunately, `get_superproject_working_tree` invokes binaries, so
that's not possible.  (If anyone is interested in picking this up, there
is a start on it, failing many tests, in the `rev-parse-safe-directory`
on my GitHub remote.)

I guess I'm looking for us to provide some basic functionality that is
guaranteed to work in this case, including `git rev-parse` and `git
config -l`.  I don't think it's useful for every program that wants to
work with Git to need to implement its own repository discovery and
config parsing, and those are essential needs for tooling that needs to
work with untrusted repositories.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Limited operations in unsafe repositories
  2024-01-07 19:40 Limited operations in unsafe repositories brian m. carlson
@ 2024-01-10 12:05 ` Jeff King
  2024-01-10 23:34   ` brian m. carlson
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2024-01-10 12:05 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git

On Sun, Jan 07, 2024 at 07:40:20PM +0000, brian m. carlson wrote:

> I had looked at sending a patch to make `git rev-parse` operate in a
> special mode where it's impossible to invoke any binaries at all, but
> unfortunately, `get_superproject_working_tree` invokes binaries, so
> that's not possible.  (If anyone is interested in picking this up, there
> is a start on it, failing many tests, in the `rev-parse-safe-directory`
> on my GitHub remote.)
> 
> I guess I'm looking for us to provide some basic functionality that is
> guaranteed to work in this case, including `git rev-parse` and `git
> config -l`.  I don't think it's useful for every program that wants to
> work with Git to need to implement its own repository discovery and
> config parsing, and those are essential needs for tooling that needs to
> work with untrusted repositories.

If I understand correctly, you want to have some very limited code paths
that we know are OK to run even in an untrusted repo. My concern there
would be that it's easy for "basic functionality" to accidentally call
into less-limited code, which suddenly becomes a vulnerability.

My thinking is to flip that around: run all code, but put protection in
the spots that do unsafe things, like loading config or examining
hooks. I.e., a patch like this:

diff --git a/config.c b/config.c
index 9ff6ae1cb9..c7bbd6bdda 100644
--- a/config.c
+++ b/config.c
@@ -2026,7 +2026,7 @@ static int do_git_config_sequence(const struct config_options *opts,
 	if (!opts->git_dir != !opts->commondir)
 		BUG("only one of commondir and git_dir is non-NULL");

-	if (opts->commondir) {
+	if (opts->commondir && is_safe_repository())
 		repo_config = mkpathdup("%s/config", opts->commondir);
 		worktree_config = mkpathdup("%s/config.worktree", opts->git_dir);
 	} else {
diff --git a/hook.c b/hook.c
index f6306d72b3..4fcfd82dc5 100644
--- a/hook.c
+++ b/hook.c
@@ -12,6 +12,9 @@ const char *find_hook(const char *name)
 {
 	static struct strbuf path = STRBUF_INIT;

+	if (!is_safe_repository())
+		return NULL;
+
 	strbuf_reset(&path);
 	strbuf_git_path(&path, "hooks/%s", name);
 	if (access(path.buf, X_OK) < 0) {

where is_safe_repository() is something like:

  - if git_env_bool("GIT_ASSUME_SAFE", 1) is true, return true
    immediately

  - otherwise, see if we are matched by safe.directory

And then running:

  git --assume-unsafe rev-parse ...

would set that variable to "0".

And then most functionality would just work, modulo trusting repo hooks
and config. I mentioned "loading config" earlier, but of course that
patch is just touching the usual "load all config sources" code. We'd
still allow parsing .git/config for repo discovery, and something like:

  git --assume-unsafe config -l --local

would still work if the caller knew they would be careful with the
output.

The downside, of course, is that we might fail to include other
dangerous spots. Those two are the big ones, I think, but would we want
to prevent people from pointing to /dev/mem in info/alternates? I dunno.
It certainly is more protection than we offer today, but maybe saying
"this is safe" would produce a false sense of security.

I do think something like git-prompt.sh could benefit here (it could use
--assume-unsafe for all invocations so you don't get surprised just by
chdir-ing into a downloaded tarball).

-Peff

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Limited operations in unsafe repositories
  2024-01-10 12:05 ` Jeff King
@ 2024-01-10 23:34   ` brian m. carlson
  2024-01-11  0:04     ` Junio C Hamano
  2024-01-11  7:01     ` Jeff King
  0 siblings, 2 replies; 7+ messages in thread
From: brian m. carlson @ 2024-01-10 23:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 906 bytes --]

On 2024-01-10 at 12:05:31, Jeff King wrote:
> My thinking is to flip that around: run all code, but put protection in
> the spots that do unsafe things, like loading config or examining
> hooks. I.e., a patch like this:

I think that's much what I had intended to do with not invoking binaries
at all, except that it was limited to rev-parse.  I wonder if perhaps we
could do something similar if we had the `--assume-unsafe` argument you
proposed, except that we would only allow the `git` binary and always
pass that argument to it in such a case.

I don't think reading config is intrinsically unsafe; it's more of what
we do with it, which is spawning external processes, that's the problem.
I suppose an argument could be made for injecting terminal sequences or
such, though.  Hooks, obviously, are definitely unsafe.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Limited operations in unsafe repositories
  2024-01-10 23:34   ` brian m. carlson
@ 2024-01-11  0:04     ` Junio C Hamano
  2024-01-11  7:01     ` Jeff King
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2024-01-11  0:04 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Jeff King, git

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2024-01-10 at 12:05:31, Jeff King wrote:
>> My thinking is to flip that around: run all code, but put protection in
>> the spots that do unsafe things, like loading config or examining
>> hooks. I.e., a patch like this:
>
> I think that's much what I had intended to do with not invoking binaries
> at all, except that it was limited to rev-parse.  I wonder if perhaps we
> could do something similar if we had the `--assume-unsafe` argument you
> proposed, except that we would only allow the `git` binary and always
> pass that argument to it in such a case.
>
> I don't think reading config is intrinsically unsafe; it's more of what
> we do with it, which is spawning external processes, that's the problem.
> I suppose an argument could be made for injecting terminal sequences or
> such, though.  Hooks, obviously, are definitely unsafe.

Sure.  And we allow the location of hook programs to be specified as
configuration variable values, which would make the config even more
dangerous X-<.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Limited operations in unsafe repositories
  2024-01-10 23:34   ` brian m. carlson
  2024-01-11  0:04     ` Junio C Hamano
@ 2024-01-11  7:01     ` Jeff King
  2024-01-11  7:17       ` Patrick Steinhardt
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff King @ 2024-01-11  7:01 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git

On Wed, Jan 10, 2024 at 11:34:04PM +0000, brian m. carlson wrote:

> On 2024-01-10 at 12:05:31, Jeff King wrote:
> > My thinking is to flip that around: run all code, but put protection in
> > the spots that do unsafe things, like loading config or examining
> > hooks. I.e., a patch like this:
> 
> I think that's much what I had intended to do with not invoking binaries
> at all, except that it was limited to rev-parse.  I wonder if perhaps we
> could do something similar if we had the `--assume-unsafe` argument you
> proposed, except that we would only allow the `git` binary and always
> pass that argument to it in such a case.

I'm not sure what you mean by "invoking binaries". I had assumed that
meant just running other Git sub-processes. But if "--assume-unsafe" is
just setting an environment variable, they'd automatically be protected.

> I don't think reading config is intrinsically unsafe; it's more of what
> we do with it, which is spawning external processes, that's the problem.
> I suppose an argument could be made for injecting terminal sequences or
> such, though.  Hooks, obviously, are definitely unsafe.

Right, it's not config itself that's unsafe; it's that many options are.
We could try to annotate them to say "it is OK to parse core.eol but not
core.pager", presumably with an allow-known-good approach (since so many
ard bad!). But that feels like an ongoing maintenance headache, and an
easy way to make a mistake (your mention of terminal sequences makes me
assume you're thinking of "color.diff.*", etc). A rule like "we do not
read repo-level config at all" seems easier to explain (to me, anyway).

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Limited operations in unsafe repositories
  2024-01-11  7:01     ` Jeff King
@ 2024-01-11  7:17       ` Patrick Steinhardt
  2024-01-11  7:30         ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Patrick Steinhardt @ 2024-01-11  7:17 UTC (permalink / raw)
  To: Jeff King; +Cc: brian m. carlson, git

[-- Attachment #1: Type: text/plain, Size: 2283 bytes --]

On Thu, Jan 11, 2024 at 02:01:14AM -0500, Jeff King wrote:
> On Wed, Jan 10, 2024 at 11:34:04PM +0000, brian m. carlson wrote:
> 
> > On 2024-01-10 at 12:05:31, Jeff King wrote:
> > > My thinking is to flip that around: run all code, but put protection in
> > > the spots that do unsafe things, like loading config or examining
> > > hooks. I.e., a patch like this:
> > 
> > I think that's much what I had intended to do with not invoking binaries
> > at all, except that it was limited to rev-parse.  I wonder if perhaps we
> > could do something similar if we had the `--assume-unsafe` argument you
> > proposed, except that we would only allow the `git` binary and always
> > pass that argument to it in such a case.
> 
> I'm not sure what you mean by "invoking binaries". I had assumed that
> meant just running other Git sub-processes. But if "--assume-unsafe" is
> just setting an environment variable, they'd automatically be protected.
> 
> > I don't think reading config is intrinsically unsafe; it's more of what
> > we do with it, which is spawning external processes, that's the problem.
> > I suppose an argument could be made for injecting terminal sequences or
> > such, though.  Hooks, obviously, are definitely unsafe.
> 
> Right, it's not config itself that's unsafe; it's that many options are.
> We could try to annotate them to say "it is OK to parse core.eol but not
> core.pager", presumably with an allow-known-good approach (since so many
> ard bad!). But that feels like an ongoing maintenance headache, and an
> easy way to make a mistake (your mention of terminal sequences makes me
> assume you're thinking of "color.diff.*", etc). A rule like "we do not
> read repo-level config at all" seems easier to explain (to me, anyway).

With the exemption of the repository format, I assume? We have to parse
things like `core.repositoryFormatVersion` and extensions in order to
figure out how a repository has to be accessed. So I agree that we
should not partition config based on safeness, which is going to be a
headache as you rightly point out. But we can partition based on whether
or not config is required in order to access the repository, where the
set of relevant config keys is a whole lot smaller.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Limited operations in unsafe repositories
  2024-01-11  7:17       ` Patrick Steinhardt
@ 2024-01-11  7:30         ` Jeff King
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2024-01-11  7:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: brian m. carlson, git

On Thu, Jan 11, 2024 at 08:17:14AM +0100, Patrick Steinhardt wrote:

> > Right, it's not config itself that's unsafe; it's that many options are.
> > We could try to annotate them to say "it is OK to parse core.eol but not
> > core.pager", presumably with an allow-known-good approach (since so many
> > ard bad!). But that feels like an ongoing maintenance headache, and an
> > easy way to make a mistake (your mention of terminal sequences makes me
> > assume you're thinking of "color.diff.*", etc). A rule like "we do not
> > read repo-level config at all" seems easier to explain (to me, anyway).
> 
> With the exemption of the repository format, I assume? We have to parse
> things like `core.repositoryFormatVersion` and extensions in order to
> figure out how a repository has to be accessed. So I agree that we
> should not partition config based on safeness, which is going to be a
> headache as you rightly point out. But we can partition based on whether
> or not config is required in order to access the repository, where the
> set of relevant config keys is a whole lot smaller.

Right. See the pseudo-patch I posted earlier. I think we just want to
touch do_git_config_sequence(), which leaves repo discovery and stuff
like "git config --local" working.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-01-11  7:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-07 19:40 Limited operations in unsafe repositories brian m. carlson
2024-01-10 12:05 ` Jeff King
2024-01-10 23:34   ` brian m. carlson
2024-01-11  0:04     ` Junio C Hamano
2024-01-11  7:01     ` Jeff King
2024-01-11  7:17       ` Patrick Steinhardt
2024-01-11  7:30         ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox