* Re: Is there a scriptable way to update the stat-info in the index without having git open and read those files?
2011-08-22 22:28 Is there a scriptable way to update the stat-info in the index without having git open and read those files? Elijah Newren
@ 2011-08-22 22:49 ` Junio C Hamano
2011-08-27 15:18 ` Pete Wyckoff
1 sibling, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2011-08-22 22:49 UTC
To: Elijah Newren; +Cc: Git Mailing List
Elijah Newren <newren@gmail.com> writes:
> A little more detail, for the curious: I have a script that is, among
> other things, renaming large numbers of files. Calling 'git mv <old>
> <new>' on each pair took forever. So I switched to manually renaming
> the files in the working copy myself, and using git update-index
> --index-info to do the renames in the index. The result was _much_
> faster, but of course that method blows away all the stat information
> for the relevant files and causes any subsequent git operation (after
> my script is done) to be slow. I inserted a 'git update-index -q
> --refresh' at the end of my script to fix that, but that is much
> slower than I want since it has to re-read all the affected files to
> ensure they haven't been modified (however, it isn't as slow as
> forking many git-mv processes). I've tried to look for a way to speed
> up this update, but haven't found one.
Sounds like a one-off thing to me, and I personally do not think it is
worth the effort to add something even riskier than assume-unchanged to
solve.
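
For readers following the quoted description, the rename step done through 'git update-index --index-info' can be sketched roughly as below. This is a minimal Python 3 sketch, not the original poster's script: parse_ls_files_stage and rename_records are hypothetical helper names, and it assumes unquoted paths and stage-0 entries only, with input taken from 'git ls-files -s'. Each rename becomes a mode-0 record that removes the old path, followed by a record that adds the same blob under the new path.

```python
NULL_SHA = "0" * 40  # all-zero object name used together with mode 0


def parse_ls_files_stage(text):
    """Parse `git ls-files -s` output into {path: (mode, sha)}.

    Assumes lines of the form "<mode> <sha> <stage>\t<path>" with
    unquoted paths; entries at stages other than 0 are skipped.
    """
    entries = {}
    for line in text.splitlines():
        meta, path = line.split("\t", 1)
        mode, sha, stage = meta.split(" ")
        if stage == "0":
            entries[path] = (mode, sha)
    return entries


def rename_records(entries, renames):
    """Build --index-info input for an iterable of (old, new) path pairs."""
    lines = []
    for old, new in renames:
        mode, sha = entries[old]
        # A mode-0 record removes the old path from the index.
        lines.append("0 %s\t%s" % (NULL_SHA, old))
        # The same mode and blob are then registered under the new path.
        lines.append("%s %s\t%s" % (mode, sha, new))
    return "".join(line + "\n" for line in lines)
```

The resulting text would be fed to a single 'git update-index --index-info' process on stdin, which is what avoids forking one git-mv per file. As the quoted message notes, entries added this way carry no stat information, hence the slow refresh afterwards.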
* Re: Is there a scriptable way to update the stat-info in the index without having git open and read those files?
From: Pete Wyckoff @ 2011-08-27 15:18 UTC
To: Elijah Newren; +Cc: Git Mailing List
newren@gmail.com wrote on Mon, 22 Aug 2011 16:28 -0600:
> I want to do something really close to
> git update-index -q --refresh
> However, I want it to assume the files in the working tree are
> unmodified from the index (i.e. don't waste time opening and reading
> the file) and simply update the stat information in the index to match
> the current files on disk.
>
> Yes, I know that would be unsafe if the files don't have the
> appropriate contents; I'm promising that they do have the appropriate
> contents and don't want to pay the performance penalty for git to
> verify. Is that possible?
I have the same issue in my workflow, and agree with Junio that this
is just too bizarre to put in the code. Here's the script I use,
relying on dulwich, that you might find helpful.
-- Pete
------8<------------------
#!/usr/bin/env python2.6
# git-index-clone - Update index after a volume clone
# Copyright 2010 Pete Wyckoff <pw@padd.com>

import sys
import os
from dulwich.index import Index

index_name = ".git/index"

#
# Debugging option: show the index entry for just one file name,
# e.g. git-index-clone file/name/in/tree
#
def show_entry(name):
    idx = Index(index_name)
    if name not in idx:
        print >>sys.stderr, "No index entry", name
        return
    print "index", idx[name]
    t = update_from_stat(idx[name], name)
    print "stat ", t

#
# Stat the file, return the new tuple
#
def update_from_stat(idx, name):
    (ctime, mtime, dev, ino, mode, uid, gid, size, sha, flags) = idx
    sb = os.lstat(name)
    # times are float; dulwich converts to (sec, ns) on write
    ctime = sb.st_ctime
    mtime = sb.st_mtime
    dev = sb.st_dev
    ino = sb.st_ino
    # assume mode unchanged
    uid = sb.st_uid
    gid = sb.st_gid
    # assume size, sha, flags unchanged
    return (ctime, mtime, dev, ino, mode, uid, gid, size, sha, flags)

#
# Rewrite the whole index, replacing only the stat fields of each entry
#
def convert():
    idx_in = Index(index_name)
    os.unlink(index_name)
    idx_out = Index(index_name)
    for name in idx_in:
        idx_out[name] = update_from_stat(idx_in[name], name)
    idx_out.write()
    os.chmod(index_name, 0644)  # drop exec perms

def usage():
    print >>sys.stderr, "Usage: %s [<index entry name>]\n" % sys.argv[0]
    sys.exit(1)

def main():
    if len(sys.argv) == 1:
        convert()
    elif len(sys.argv) == 2:
        show_entry(sys.argv[1])
    else:
        usage()
    return 0

if __name__ == "__main__":
    sys.exit(main())
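
A note on the "times are float" comment in update_from_stat: a float timestamp cannot hold full nanosecond precision for present-day epoch values, so the fractional part may be rounded before it reaches the index. The following is a minimal Python 3 sketch (the script above is Python 2) that reads the integer nanosecond fields and splits them into (sec, nsec) pairs. exact_times is a hypothetical helper name, and whether a given dulwich version accepts such pairs in place of floats is an assumption to verify before relying on it.

```python
import os

NS_PER_SEC = 10 ** 9


def exact_times(path):
    """Return ((ctime_sec, ctime_nsec), (mtime_sec, mtime_nsec)) for path.

    Uses the integer st_ctime_ns / st_mtime_ns fields (Python 3.3+) so
    no precision is lost to float rounding.
    """
    sb = os.lstat(path)  # lstat, like the script: do not follow symlinks
    ctime = divmod(sb.st_ctime_ns, NS_PER_SEC)
    mtime = divmod(sb.st_mtime_ns, NS_PER_SEC)
    return ctime, mtime
```

If the rounded float values differ from what git itself would record, git may treat the entries as stat-dirty and re-read the files on the next refresh, which is the cost the script is trying to avoid.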