All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Wong <e@yhbt.net>
To: Ivan Baldo <ibaldo@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Fastest way to set files date and time to latest commit time of each one
Date: Sat, 29 Aug 2020 04:48:42 +0000	[thread overview]
Message-ID: <20200829044842.GA5732@dcvr> (raw)
In-Reply-To: <CAEbcw=3mOoYuJo2mQgqB2aJgn-D2i_7ZRmhfPvYNVHD1Kp8wuA@mail.gmail.com>

Ivan Baldo <ibaldo@gmail.com> wrote:
>   Hello.
>   I know this is not standard usage of git, but I need a way to have
> more stable dates and times in the files in order to avoid rsync
> checksumming.
>   So I found this
> https://stackoverflow.com/questions/2179722/checking-out-old-file-with-original-create-modified-timestamps/2179876#2179876
> and modified it a bit to run in CentOS 7:
> 
> IFS="
> "
> for FILE in $(git ls-files -z | tr '\0' '\n')
> do
>     TIME=$(git log --pretty=format:%cd -n 1 --date=iso -- "$FILE")
>     touch -c -m -d "$TIME" "$FILE"
> done
> 
>   Unfortunately it takes ages for a 84k files repo.
>   I see the CPU usage is dominated by the git log command.

running git log for each file isn't necessary.

On Debian, rsync actually ships the `git-set-file-times' script
in /usr/share/doc/rsync/scripts/ which only runs `git log' once
and parses it.

You can also get my (original) version from:
https://yhbt.net/git-set-file-times

>   I know a way I could use to split the work for all the CPU threads
> but anyway, I would like to know if you guys and girls know of a
> faster way to do this.

Much of your overhead is going to be from process spawning.
My Perl version reduces that significantly.

I haven't tried it with 84K files, but it'll have to keep all
those filenames in memory.  I'm not sure if parallelizing
utime() syscalls is worth it, either; maybe it helps on SSD
more than HDD.

  parent reply	other threads:[~2020-08-29  5:03 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-29  1:36 Fastest way to set files date and time to latest commit time of each one Ivan Baldo
2020-08-29  3:20 ` Junio C Hamano
2020-08-29  4:59   ` Raymond E. Pasco
2020-08-29  4:48 ` Eric Wong [this message]
2020-09-02 19:28   ` Ivan Baldo
2020-08-29  6:46 ` Andreas Schwab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200829044842.GA5732@dcvr \
    --to=e@yhbt.net \
    --cc=git@vger.kernel.org \
    --cc=ibaldo@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.