From: John Keeping <john@keeping.me.uk>
To: Jan Vales <jan@jvales.net>
Cc: git@vger.kernel.org
Subject: Re: unexplained behavior/issue with git archive?
Date: Thu, 23 Jul 2015 16:59:36 +0100
Message-ID: <20150723155936.GC14935@serenity.lan>
In-Reply-To: <55B10705.6090303@jvales.net>

On Thu, Jul 23, 2015 at 05:23:49PM +0200, Jan Vales wrote:
> i seem to trigger behavior i do not understand with git archive.
>
> I have this little 3 liner (vmdiff.sh):
> #!/bin/bash
> git diff --name-status "$2" "$3" > "$1.files"
> git diff --name-only "$2" "$3" |xargs -d'\n' git archive -o "$1" "$3" --
>
>
> For testing purpose, lets assume this call:
> # ./vmdiff.sh latest.zip HEAD^1 HEAD
>
> # cat latest.zip.files | wc -l
> 149021
>
> # cat latest.zip.files | egrep "^D" | wc -l
> 159
>
> # mkdir empty; cd empty; unzip latest.zip ; find * | wc -l
> 1090
>
> My goal is to basically diff (parts of) filesystems against each other
> and create an archive with all changed files + a file list to know what
> files were deleted. (I currently do not care about the files
> permissions+ownership, and it doesnt really matter in the current
> problem. Also dont ask, why one would store a root-filesystem in git :)
>
> What I do not understand: why does the zip file only contains 1090
> files+dirs if the wc -l shows like 150k files and only like 159 were
> deleted?
> There should be like 149k files in that archive.
>
> Also only the few files are all from "var" and none from etc or srv
> where definitely files changed in too! (and show up in latest.zip.files)
>
> Is there a limit of files git archive can process?
Not explicitly, but there is a limit on the size of a command line, and
xargs will invoke the command multiple times if it is given more
arguments than fit in one.
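
As a rough illustration of that limit (a sketch, not something from the
original report): on a POSIX system, getconf ARG_MAX reports the upper
bound on the combined size of arguments and environment for a single
command.  xargs may split its input well before that point, since
implementations typically use their own, smaller buffer.  The variable
name "limit" is only for illustration.

```shell
# Upper bound, in bytes, on arguments plus environment for one exec.
limit=$(getconf ARG_MAX)
echo "ARG_MAX is $limit bytes"
```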
What happens if you do:

    git diff --name-only HEAD^ HEAD | xargs -d'\n' echo | wc -l

?
With a small number of items, there should be only one output line, but
if xargs invokes the command multiple times there will be multiple
lines.  For example (using -L2 to limit each invocation to two input
lines, which here means two arguments):

    $ printf '%s\n' a b c | xargs -d'\n' echo | wc -l
    1
    $ printf '%s\n' a b c | xargs -d'\n' -L2 echo | wc -l
    2
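
To see why multiple invocations would matter for the script above, here
is a minimal sketch using a stand-in for git archive.  The stand-in is
an assumption for illustration: a small sh -c command that, like
git archive -o, rewrites its output file from scratch on every call.
-n 2 forces xargs to run it twice, and only the last batch survives:

```shell
out=$(mktemp)
# Inside sh -c, $0 is the output file and "$@" holds the arguments
# that xargs passed to this one invocation.
printf '%s\n' a b c |
    xargs -n 2 sh -c 'printf "%s\n" "$@" > "$0"' "$out"
result=$(cat "$out")
echo "$result"    # prints: c
rm -f "$out"
```

If the same thing happens in the original pipeline, latest.zip would end
up holding only the paths from the final xargs batch, which would be
consistent with seeing far fewer entries than the file list contains.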