git.vger.kernel.org archive mirror
* how to use git with unreachable paths (namespaces, proc)?
From: benaryorg @ 2024-10-07  0:59 UTC
  To: git


After trying to get search engines to tell me anything useful about git and unreachable paths, directories, or symlinks, I've given up; this is very hard to search for.

The issue I'm running into is what happens when git on Linux is faced with unreachable paths, such as a symlink that points at a directory that is otherwise unreachable from the filesystem, or a current working directory that is not a descendant of / (symlinks such as /dev/fd and /proc/.../fd).
My use case is trying to access a directory via an open file descriptor from within a Linux namespace (created using bubblewrap, basically a sandbox without my home directory).
Something along these lines should work:

     # add more ro-binds as necessary; also assumes your home is not in any of the ro-binds
     bwrap --unshare-all --share-net \
         --ro-bind /bin /bin --ro-bind /usr /usr --ro-bind /lib /lib \
         --ro-bind /etc /etc --ro-bind /sbin /sbin \
         --dev /dev --proc /proc \
         --dir ~ --chdir ~ \
         $SHELL 3< ~/Documents/my_git_dir

This should drop you into your usual shell without your home directory, while still giving you access to the git directory named at the end of the command via /proc/$$/fd/3, /dev/fd/3, or similar.
(I am aware that this effectively breaks the security guarantees of the sandbox, since accessing /proc/$$/fd/3/../../.. works; security wasn't the point of the sandbox to begin with.)
Now if you cd to /proc/$$/fd/3 and run something like `git status`, you'll be greeted by `fatal: Unable to read current working directory: No such file or directory` (at least on my machine).
Cloning the repository seems to work. However, if you want to clone it without duplicating the data, you may be inclined to use `git clone --shared /proc/$$/fd/3`, which errors out with a lot of `error: unable to normalize alternate object path: /proc/2/fd/3/.git/objects`.
Similarly `cd -P /proc/$$/fd/3 && git clone --shared . ~/test` will fail with `fatal: unable to get current working directory: No such file or directory`.

As far as I can tell, this happens because git tries to resolve the symlinks in the path and then access the result, which of course doesn't work out here. (The kernel presents /proc/$$/fd/3 as a symlink to ~/Documents/my_git_dir even though that path doesn't exist in the sandbox's hierarchy, yet you can still access the directory through the link; please don't ask me about the specifics of that.)
For the shared clone I understand why a path is necessary, considering that it needs to add the references to the new repository, but there seems to be no fallback for a scenario like this one.
strace tells me that the call to `getcwd(2)` yields the path (as seen outside the namespace) prefixed with "(unreachable)", which would explain why things fall apart, as this path is neither absolute nor accessible. In other cases git tries to walk up the tree (by which I mean the tree outside the namespace) until it hits the root, which I assume is its way of canonicalizing the path; that won't work here either.
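
To make the `getcwd(2)` observation concrete, here is a minimal Python sketch (not git's code) of how a caller could classify the string that comes back. The names `classify_cwd` and `probe_cwd` are hypothetical. As far as I know, glibc 2.27 and later make getcwd(3) fail with ENOENT instead of passing the "(unreachable)" string through, which would match the `No such file or directory` message above; the sketch reports that case as "unavailable".

```python
import os

# Prefix the Linux kernel puts in front of a working directory that cannot
# be reached from the process's root directory.
UNREACHABLE_PREFIX = "(unreachable)"


def classify_cwd(raw):
    """Classify a working-directory string as returned by the getcwd system call."""
    if raw.startswith(UNREACHABLE_PREFIX):
        return "unreachable"
    if os.path.isabs(raw):
        return "absolute"
    return "relative"


def probe_cwd():
    """Classify the working directory of the current process."""
    try:
        return classify_cwd(os.getcwd())
    except FileNotFoundError:
        # The C library refused to hand back a non-absolute path (ENOENT),
        # or the directory has been removed in the meantime.
        return "unavailable"
```

Running `probe_cwd()` from inside /proc/$$/fd/3 in the sandbox is one way to see which of the two behaviours a given system exhibits, without involving git at all.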

So my question is: is this a bug or intentional behaviour (given how particular git is around symlinks I can imagine either)? And how can I work around it so that git performs basic operations such as status or archive, or alternatively get clone with `--shared` to work, so that git can at least operate on a separate repository without all the prior objects being duplicated?
Also note that this is absolutely not important to me, so if this is just something that git cannot handle in its current implementation, that'd be fine by me. I just figured the chances that someone will reply to this with "why don't you use --ignore-path-reachability" are not negligible.


* Re: how to use git with unreachable paths (namespaces, proc)?
From: brian m. carlson @ 2024-10-07 21:48 UTC
  To: benaryorg; +Cc: git

On 2024-10-07 at 00:59:43, benaryorg wrote:
> Now as far as I can tell this is because git tries to resolve the
> symlinks in the path (/proc/$$/fd/3 is provided by the kernel as a
> symlink to ~/Documents/my_git_dir even though it doesn't exist in that
> hierarchy, yet you can access it, please don't ask me about the
> specifics of that) and then access them, which of course here doesn't
> work out. For the shared clone I understand why a path is necessary,
> considering that it needs to add the references to the new repository,
> however there seems to be no fallback for a scenario like this one.
> strace tells me that the call to `getcwd(2)` yields the path (outside
> the namespace) prefixed with "(unreachable)", which would explain why
> things fall apart (as this path is neither absolute nor accessible),
> or in other cases that git tries to walk the tree up (and by that I
> mean the tree outside the namespace) until it hits root, which I
> assume is its way to try and canonicalize the path, which here won't
> work either.
> 
> So my question is: is this a bug or intentional behaviour (given how
> particular git is around symlinks I can imagine either), and how can I
> work around this and make git perform basic operations such as status,
> archive, or alternatively get clone with `--shared` to work so that
> git can at least operate on a separate repository without all the
> prior objects being duplicated?
> Also note that this is absolutely not important to me, so if this is
> just something that git cannot handle in its current implementation
> that'd be fine by me, I just figured the chances that someone will
> reply to this with "why don't you use --ignore-path-reachability" are
> not negligible.

Git (and all compatible implementations) always canonicalizes the path
to the `.git` directory (or bare repository) and the working tree (if
any).
In your case, that won't work, because `getcwd(2)` returns a path that
doesn't work with `realpath(3)`, so Git is always going to fail.

The path canonicalization is required because otherwise it's very easy
to accidentally break the repository, and some old versions of Git had
problems when accessed from a path that contained a symlink[0], so it's
unlikely we'll add an option to skip it.

[0] Since I copy all of my home directory across when I get a new
machine, I actually have a broken repository from this era still today.
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
