* Is there a scriptable way to update the stat-info in the index without having git open and read those files?
@ 2011-08-22 22:28 Elijah Newren
2011-08-22 22:49 ` Junio C Hamano
2011-08-27 15:18 ` Pete Wyckoff
0 siblings, 2 replies; 3+ messages in thread
From: Elijah Newren @ 2011-08-22 22:28 UTC
To: Git Mailing List
Hi,
I want to do something really close to
git update-index -q --refresh
However, I want it to assume the files in the working tree are
unmodified from the index (i.e. don't waste time opening and reading
the file) and simply update the stat information in the index to match
the current files on disk.
Yes, I know that would be unsafe if the files don't have the
appropriate contents; I'm promising that they do have the appropriate
contents and don't want to pay the performance penalty for git to
verify. Is that possible?
A little more detail, for the curious: I have a script that is, among
other things, renaming large numbers of files. Calling 'git mv <old>
<new>' on each pair took forever. So I switched to manually renaming
the files in the working copy myself, and using git update-index
--index-info to do the renames in the index. The result was _much_
faster, but of course that method blows away all the stat information
for the relevant files and causes any subsequent git operation (after
my script is done) to be slow. I inserted a 'git update-index -q
--refresh' at the end of my script to fix that, but that is much
slower than I want since it has to re-read all the affected files to
ensure they haven't been modified (however, it isn't as slow as
forking many git-mv processes). I've tried to look for a way to speed
up this update, but haven't found one. Did I miss it?
Thanks,
Elijah
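
The rename-through-the-index approach described above can be sketched roughly as follows. This is only an illustration, not the script from the message: the function name index_info_for_renames and the renames mapping are hypothetical, and it assumes the input lines come from "git ls-files -s" with paths that need no quoting. Each renamed path gets a mode-0 record to drop the old entry, followed by a record that re-adds the same blob under the new name.

```python
NULL_SHA = "0" * 40  # placeholder object name used with mode 0 (entry removal)


def index_info_for_renames(ls_files_lines, renames):
    """Yield records for `git update-index --index-info`.

    ls_files_lines: lines shaped like `git ls-files -s` output,
                    "<mode> <sha> <stage>\\t<path>".
    renames:        hypothetical dict mapping old path -> new path.
    """
    for line in ls_files_lines:
        meta, path = line.rstrip("\n").split("\t", 1)
        if path not in renames:
            continue  # entry is not being renamed; leave it alone
        mode, sha, stage = meta.split(" ")
        # A mode of 0 asks update-index to remove the old path.
        yield "0 %s\t%s" % (NULL_SHA, path)
        # Re-add the same blob under the new path.
        yield "%s %s %s\t%s" % (mode, sha, stage, renames[path])
```

The resulting records would be joined with newlines and fed to the standard input of a single git update-index --index-info process, which is where the savings over one git mv per file come from. As the message notes, entries written this way carry no stat information.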
* Re: Is there a scriptable way to update the stat-info in the index without having git open and read those files?
From: Junio C Hamano @ 2011-08-22 22:49 UTC
To: Elijah Newren; +Cc: Git Mailing List
Elijah Newren <newren@gmail.com> writes:
> A little more detail, for the curious: I have a script that is, among
> other things, renaming large numbers of files. Calling 'git mv <old>
> <new>' on each pair took forever. So I switched to manually renaming
> the files in the working copy myself, and using git update-index
> --index-info to do the renames in the index. The result was _much_
> faster, but of course that method blows away all the stat information
> for the relevant files and causes any subsequent git operation (after
> my script is done) to be slow. I inserted a 'git update-index -q
> --refresh' at the end of my script to fix that, but that is much
> slower than I want since it has to re-read all the affected files to
> ensure they haven't been modified (however, it isn't as slow as
> forking many git-mv processes). I've tried to look for a way to speed
> up this update, but haven't found one.
Sounds like a one-off thing to me, and I personally do not think it is
worth the effort to add something even riskier than assume-unchanged to
solve.
* Re: Is there a scriptable way to update the stat-info in the index without having git open and read those files?
From: Pete Wyckoff @ 2011-08-27 15:18 UTC
To: Elijah Newren; +Cc: Git Mailing List
newren@gmail.com wrote on Mon, 22 Aug 2011 16:28 -0600:
> I want to do something really close to
> git update-index -q --refresh
> However, I want it to assume the files in the working tree are
> unmodified from the index (i.e. don't waste time opening and reading
> the file) and simply update the stat information in the index to match
> the current files on disk.
>
> Yes, I know that would be unsafe if the files don't have the
> appropriate contents; I'm promising that they do have the appropriate
> contents and don't want to pay the performance penalty for git to
> verify. Is that possible?
I have the same issue in my workflow, and agree with Junio that this
is just too bizarre to put in the code. Here's the script I use,
relying on dulwich, that you might find helpful.
-- Pete
------8<------------------
#!/usr/bin/env python2.6
# git-index-clone - Update index after a volume clone
# Copyright 2010 Pete Wyckoff <pw@padd.com>

import sys
import os
from dulwich.index import Index

index_name = ".git/index"

#
# Debugging option: show the index entry for just one file name,
# e.g. git-index-clone file/name/in/tree
#
def show_entry(name):
    idx = Index(index_name)
    if name not in idx:
        print >>sys.stderr, "No index entry", name
        return
    print "index", idx[name]
    t = update_from_stat(idx[name], name)
    print "stat ", t

#
# Stat the file, return the new tuple
#
def update_from_stat(idx, name):
    (ctime, mtime, dev, ino, mode, uid, gid, size, sha, flags) = idx
    sb = os.lstat(name)
    # times are float; dulwich converts to (sec, ns) on write
    ctime = sb.st_ctime
    mtime = sb.st_mtime
    dev = sb.st_dev
    ino = sb.st_ino
    # assume mode unchanged
    uid = sb.st_uid
    gid = sb.st_gid
    # assume size, sha, flags unchanged
    return (ctime, mtime, dev, ino, mode, uid, gid, size, sha, flags)

#
# Rewrite every index entry with fresh stat data from the working tree
#
def convert():
    idx_in = Index(index_name)
    os.unlink(index_name)
    idx_out = Index(index_name)
    for name in idx_in:
        idx_out[name] = update_from_stat(idx_in[name], name)
    idx_out.write()
    os.chmod(index_name, 0644)  # drop exec perms

def usage():
    print >>sys.stderr, "Usage: %s [<index entry name>]\n" % sys.argv[0]
    sys.exit(1)

def main():
    if len(sys.argv) == 1:
        convert()
    elif len(sys.argv) == 2:
        show_entry(sys.argv[1])
    else:
        usage()
    return 0

if __name__ == "__main__":
    sys.exit(main())
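
For readers without dulwich at hand, the idea behind update_from_stat can be sketched with the standard library alone. This is an illustration under assumptions, not part of the script above: the dict-shaped entry and the name refresh_stat are hypothetical and do not correspond to dulwich's or git's own data structures, and nothing here reads or writes a real index file. It only shows which fields are taken from lstat and which are trusted as-is.

```python
import os

# Fields refreshed from the filesystem; everything else is trusted as-is.
STAT_FIELDS = ("ctime", "mtime", "dev", "ino", "uid", "gid")


def refresh_stat(entry, path):
    """Return a copy of a (hypothetical) dict-shaped index entry with its
    stat fields replaced by the current lstat() values for `path`.

    mode, size, sha and flags are carried over untouched, i.e. the caller
    promises that the file content still matches the recorded sha.
    """
    sb = os.lstat(path)  # lstat: do not follow symlinks
    updated = dict(entry)
    for field in STAT_FIELDS:
        updated[field] = getattr(sb, "st_" + field)
    return updated
```

Because the file is never opened, the cost per entry is a single lstat call, which is the saving the original question asks for. The trade-off is the same one discussed in the thread: if the content has in fact changed, the refreshed entry will hide that change.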