* [RFC] archive: behavior of --prefix with absolute or parent path components
@ 2026-04-07 16:21 Pushkar Singh
2026-04-07 19:24 ` Jeff King
2026-04-08 16:00 ` [PATCH] archive: document --prefix handling of absolute and parent paths Pushkar Singh
0 siblings, 2 replies; 6+ messages in thread
From: Pushkar Singh @ 2026-04-07 16:21 UTC (permalink / raw)
To: git; +Cc: gitster, peff
Hi,
While experimenting with "git archive", I noticed some behavior around
the --prefix option that might be worth clarifying.
Currently, --prefix accepts values such as absolute paths or ones with ..,
e.g.:
git archive --prefix=/ HEAD > out.tar
git archive --prefix=//// HEAD > out.tar
git archive --prefix=../../ HEAD > out.tar
Upon listing the archive contents (e.g., tar -tf), you get entries like:
/a.txt
////a.txt
../../a.txt
In such cases, tar emits warnings like:
"Removing leading '/' from member names"
"Removing leading '../' from member names"
This suggests that Git passes the prefix through as-is, relying on
downstream tools to sanitize potentially unsafe paths.
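To illustrate what "leaving validation to archive consumers" can look like, here is a minimal Python sketch. It does not invoke Git. It builds a small tar archive in memory whose member names mimic the ones listed above, then labels each name the way a cautious consumer might. The helper names (`build_archive`, `list_names`, `classify`) and the `proj/` prefix are assumptions made for this illustration only.

```python
import io
import tarfile


def build_archive(prefixes, filename="a.txt"):
    """Return tar bytes with one member per prefix, named prefix + filename.

    This only imitates the member names reported in the thread; it is not
    how "git archive" itself is implemented.
    """
    buf = io.BytesIO()
    data = b"hello\n"
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for prefix in prefixes:
            # TarInfo stores the name exactly as given, with no normalization.
            info = tarfile.TarInfo(name=prefix + filename)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_names(archive_bytes):
    """Return the member names stored in the archive, in order."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return tar.getnames()


def classify(name):
    """Label a member name: 'absolute', 'parent', or 'ok'."""
    if name.startswith("/"):
        return "absolute"
    if ".." in name.split("/"):
        return "parent"
    return "ok"


if __name__ == "__main__":
    archive = build_archive(["/", "////", "../../", "proj/"])
    for name in list_names(archive):
        print(f"{classify(name):9} {name}")
```

A consumer could refuse to extract anything not labeled `ok`, or strip the offending components and warn, which is roughly the choice tar makes by default.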
From a user perspective, I was wondering:
- Is this behavior intentional (i.e., leaving validation to archive
consumers)?
- Would it be worth documenting this explicitly?
- Or should there be any normalization or validation at the Git level?
I understand that Git generally avoids enforcing policy decisions in
such cases, but I wanted to confirm whether this behavior is intentional.
I’d appreciate any thoughts on this :-)
Thanks,
Pushkar
* Re: [RFC] archive: behavior of --prefix with absolute or parent path components
From: Jeff King @ 2026-04-07 19:24 UTC (permalink / raw)
To: Pushkar Singh; +Cc: git, gitster
On Tue, Apr 07, 2026 at 04:21:01PM +0000, Pushkar Singh wrote:

> Currently, --prefix accepts values such as absolute paths or ones with ..,
> e.g.:
> git archive --prefix=/ HEAD > out.tar
> git archive --prefix=//// HEAD > out.tar
> git archive --prefix=../../ HEAD > out.tar
>
> Upon listing the archive contents (e.g., tar -tf), you get entries like:
> /a.txt
> ////a.txt
> ../../a.txt
>
> In such cases, tar emits warnings like:
> "Removing leading '/' from member names"
> "Removing leading '../' from member names"

Yes, but note that with "-P" tar will happily allow those paths. They
_can_ be useful, if you know what you are doing, but they aren't
necessarily safe when coming from untrusted sources.

We can also generate zip files, but I think most unzip implementations
have similar restrictions (info-zip does, with "-:" to override).

In theory we could support other formats, but after 20 years I don't
think anybody has bothered to do so. Cpio, anyone? :)

Though speaking of cpio (the command, not the format), it will happily
list and extract the paths above from a tar input without any extra
option (it has an option to restrict, but unlike tar, it defaults to
off).

> From a user perspective, I was wondering:
> - Is this behavior intentional (i.e., leaving validation to archive
> consumers)?
> - Would it be worth documenting this explicitly?
> - Or should there be any normalization or validation at the Git level?
>
> I understand that Git generally avoids enforcing policy decisions in
> such cases, but I wanted to confirm whether this behavior is intentional.

I don't recall it ever being discussed. Of the three you mentioned,
"../" and leading "/" are potentially useful, so I don't think we'd want
to disallow them entirely. At least some tar implementations require
"-P" on the generating side to avoid mistakes, so we could follow that
path. It may be considered a regression by anybody who is using the
feature currently, though.

The "////" is meaningless AFAICT, and could be replaced with a single
slash. But I think it's also mostly harmless, as the reading side (well,
the kernel) will equate "foo/////file" and "foo/file". I don't know if
there are systems where that would not be the case.

So...yeah. I guess we can document it more explicitly. Since you seem to
be the first to ask about it, it does not seem like a common question.
But if we can clarify the behavior without making the current docs
harder to read, I don't see a problem in doing so.

-Peff
* Re: [RFC] archive: behavior of --prefix with absolute or parent path components
From: Junio C Hamano @ 2026-04-07 19:57 UTC (permalink / raw)
To: Jeff King; +Cc: Pushkar Singh, git
Jeff King <peff@peff.net> writes:

>> In such cases, tar emits warnings like:
>> "Removing leading '/' from member names"
>> "Removing leading '../' from member names"
>
> Yes, but note that with "-P" tar will happily allow those paths. They
> _can_ be useful, if you know what you are doing, but they aren't
> necessarily safe when coming from untrusted sources.
>
> We can also generate zip files, but I think most unzip implementations
> have similar restrictions (info-zip does, with "-:" to override).
>
> In theory we could support other formats, but after 20 years I don't
> think anybody has bothered to do so. Cpio, anyone? :)
>
> Though speaking of cpio (the command, not the format), it will happily
> list and extract the paths above from a tar input without any extra
> option (it has an option to restrict, but unlike tar, it defaults to
> off).
>
>> From a user perspective, I was wondering:
>> - Is this behavior intentional (i.e., leaving validation to archive
>> consumers)?
>> - Would it be worth documenting this explicitly?
>> - Or should there be any normalization or validation at the Git level?
>>
>> I understand that Git generally avoids enforcing policy decisions in
>> such cases, but I wanted to confirm whether this behavior is intentional.
>
> I don't recall it ever being discussed. Of the three you mentioned,
> "../" and leading "/" are potentially useful, so I don't think we'd want
> to disallow them entirely. At least some tar implementations require
> "-P" on the generating side to avoid mistakes, so we could follow that
> path. It may be considered a regression by anybody who is using the
> feature currently, though.

Thanks. I was writing almost exactly the same message ;-)

> The "////" is meaningless AFAICT, and could be replaced with a single
> slash. But I think it's also mostly harmless, as the reading side (well,
> the kernel) will equate "foo/////file" and "foo/file". I don't know if
> there are systems where that would not be the case.
>
> So...yeah. I guess we can document it more explicitly. Since you seem to
> be the first to ask about it, it does not seem like a common question.
> But if we can clarify the behavior without making the current docs
> harder to read, I don't see a problem in doing so.

Yup, in other words, "Patches welcome".
* Re: [RFC] archive: behavior of --prefix with absolute or parent path components
From: brian m. carlson @ 2026-04-07 22:24 UTC (permalink / raw)
To: Jeff King; +Cc: Pushkar Singh, git, gitster
On 2026-04-07 at 19:24:54, Jeff King wrote:

> Yes, but note that with "-P" tar will happily allow those paths. They
> _can_ be useful, if you know what you are doing, but they aren't
> necessarily safe when coming from untrusted sources.
>
> We can also generate zip files, but I think most unzip implementations
> have similar restrictions (info-zip does, with "-:" to override).

I suspect there are people using this with `/` because they want to
deploy files to places like `/etc`. We've actually had requests for the
ability to have multiple roots in a repository so that people can do
this kind of thing, so I'm certain there are people finding _some_ way
to do it, even if not with this exact approach. In conjunction with a
tool like mtree(1) to adjust ownership and permissions, this could be
useful.

> In theory we could support other formats, but after 20 years I don't
> think anybody has bothered to do so. Cpio, anyone? :)

cpio doesn't have the long filename support that our pax (tar) archives
have, so I wouldn't recommend adding it. The only place I still see
people use it is initramfs images for Linux.

> I don't recall it ever being discussed. Of the three you mentioned,
> "../" and leading "/" are potentially useful, so I don't think we'd want
> to disallow them entirely. At least some tar implementations require
> "-P" on the generating side to avoid mistakes, so we could follow that
> path. It may be considered a regression by anybody who is using the
> feature currently, though.
>
> The "////" is meaningless AFAICT, and could be replaced with a single
> slash. But I think it's also mostly harmless, as the reading side (well,
> the kernel) will equate "foo/////file" and "foo/file". I don't know if
> there are systems where that would not be the case.

Technically, POSIX allows `//` to be different than `/`, I believe,
although I'm not aware of anyone outside of Windows (and maybe Interix)
where that has any special meaning. If you have such a system, it could
be useful to provide that as well as `/`. I agree that it's more likely
a typo, though.
--
brian m. carlson (they/them)
Toronto, Ontario, CA
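
The repeated-slash point discussed in this thread can be seen in Python's `posixpath.normpath`, which collapses runs of slashes but deliberately keeps exactly two leading slashes, since POSIX leaves the meaning of a leading `//` to the implementation. The sketch below is only an illustration of that rule, not a proposal for what Git should do; the `collapse` helper and the sample names are assumptions made for the example.

```python
import posixpath


def collapse(name):
    """Normalize a member name with posixpath.normpath.

    Runs of slashes collapse to one, except that exactly two leading
    slashes are preserved. Leading ".." components are kept as-is.
    """
    return posixpath.normpath(name)


if __name__ == "__main__":
    # Names built from the prefixes mentioned in the thread, plus "//".
    for name in ["/a.txt", "//a.txt", "////a.txt",
                 "foo/////file", "../../a.txt"]:
        print(f"{name!r} -> {collapse(name)!r}")
```

Note that a prefix of `////` normalizes to a single `/`, while `//` is left alone. This is why blindly collapsing slashes is not quite the same as treating the prefix as given.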
* [PATCH] archive: document --prefix handling of absolute and parent paths
From: Pushkar Singh @ 2026-04-08 16:00 UTC (permalink / raw)
To: pushkarkumarsingh1970; +Cc: git, gitster, peff
Clarify that --prefix is used as given and is not normalized, and may
include leading slashes or parent directory components.

Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
---
 Documentation/git-archive.adoc | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/git-archive.adoc b/Documentation/git-archive.adoc
index a0e3fe7996..086bade6d8 100644
--- a/Documentation/git-archive.adoc
+++ b/Documentation/git-archive.adoc
@@ -54,6 +54,11 @@ OPTIONS
 Prepend <prefix>/ to paths in the archive. Can be repeated; its
 rightmost value is used for all tracked files. See below which
 value gets used by `--add-file`.
++
+The <prefix> is used as given and is not normalized. It may
+include leading slashes or parent directory components (e.g.,
+`../`). Some archive consumers may treat such paths as
+potentially unsafe and adjust or warn during extraction.
 
 -o <file>::
 --output=<file>::
--
2.53.0.582.gca1db8a0f7
* Re: [PATCH] archive: document --prefix handling of absolute and parent paths
From: Jeff King @ 2026-04-08 17:40 UTC (permalink / raw)
To: Pushkar Singh; +Cc: git, gitster
On Wed, Apr 08, 2026 at 04:00:06PM +0000, Pushkar Singh wrote:

> Prepend <prefix>/ to paths in the archive. Can be repeated; its
> rightmost value is used for all tracked files. See below which
> value gets used by `--add-file`.
> ++
> +The <prefix> is used as given and is not normalized. It may
> +include leading slashes or parent directory components (e.g.,
> +`../`). Some archive consumers may treat such paths as
> +potentially unsafe and adjust or warn during extraction.

Thanks, this reads fine to me.

-Peff