From: John Keeping <john@keeping.me.uk>
To: Jan Vales <jan@jvales.net>
Cc: git@vger.kernel.org
Subject: Re: unexplained behavior/issue with git archive?
Date: Thu, 23 Jul 2015 16:59:36 +0100
Message-ID: <20150723155936.GC14935@serenity.lan>
In-Reply-To: <55B10705.6090303@jvales.net>
On Thu, Jul 23, 2015 at 05:23:49PM +0200, Jan Vales wrote:
> i seem to trigger behavior i do not understand with git archive.
>
> I have this little 3 liner (vmdiff.sh):
> #!/bin/bash
> git diff --name-status "$2" "$3" > "$1.files"
> git diff --name-only "$2" "$3" |xargs -d'\n' git archive -o "$1" "$3" --
>
>
> For testing purpose, lets assume this call:
> # ./vmdiff.sh latest.zip HEAD^1 HEAD
>
> # cat latest.zip.files | wc -l
> 149021
>
> # cat latest.zip.files | egrep "^D" | wc -l
> 159
>
> # mkdir empty; cd empty; unzip latest.zip ; find * | wc -l
> 1090
>
> My goal is to basically diff (parts of) filesystems against each other
> and create an archive with all changed files + a file list to know what
> files were deleted. (I currently do not care about the files
> permissions+ownership, and it doesnt really matter in the current
> problem. Also dont ask, why one would store a root-filesystem in git :)
>
> What I do not understand: why does the zip file only contains 1090
> files+dirs if the wc -l shows like 150k files and only like 159 were
> deleted?
> There should be like 149k files in that archive.
>
> Also only the few files are all from "var" and none from etc or srv
> where definitely files changed in too! (and show up in latest.zip.files)
>
> Is there a limit of files git archive can process?
Not explicitly, but there is a limit on the size of command lines and
xargs will invoke the command multiple times if enough arguments are
given.
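As a rough way to see the limit in question, the sketch below asks the system for its maximum argument-list size. It assumes a POSIX getconf is available; xargs may work with a smaller buffer than this value, so treat the number as an upper bound rather than the exact point at which xargs splits.

```shell
#!/bin/sh
# Upper bound, in bytes, on the combined size of arguments and environment
# passed to a single command. xargs may split its input well before this.
limit=$(getconf ARG_MAX)
echo "ARG_MAX is $limit bytes"
```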
What happens if you do:
git diff --name-only HEAD^ HEAD | xargs -d'\n' echo | wc -l
?
With a small number of items, there should only be one output line, but
if xargs invokes the command multiple times there will be multiple
lines. For example (using -L2 to limit each invocation to two input
lines, which with -d'\n' means two arguments):
$ printf '%s\n' a b c | xargs -d'\n' echo | wc -l
1
$ printf '%s\n' a b c | xargs -d'\n' -L2 echo | wc -l
2
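If the command does run more than once, every run writes to the same output file, so only the last batch would be expected to survive. The sketch below illustrates that effect without git: the inline sh -c script is a hypothetical stand-in for "git archive -o", overwriting its output file with whatever arguments it receives, and -n2 forces two arguments per invocation.

```shell
#!/bin/sh
# Stand-in for "git archive -o OUT -- paths...": each invocation overwrites
# OUT with the arguments it was given. "$0" is the output file and "$@" is
# the batch of names that xargs passed to this invocation.
out=$(mktemp)
printf '%s\n' a b c |
    xargs -d'\n' -n2 sh -c 'printf "%s\n" "$@" > "$0"' "$out"

# Only the final batch (the single name "c") is left in the file.
result=$(cat "$out")
echo "$result"
rm -f "$out"
```

The first invocation writes "a" and "b", and the second then replaces them with "c". That would be consistent with an archive holding only a small tail of the changed files.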