* how to do directory renames in fast-import
@ 2007-07-10 1:09 David Frech
2007-07-10 3:10 ` [PATCH] Support wholesale " Shawn O. Pearce
0 siblings, 1 reply; 10+ messages in thread
From: David Frech @ 2007-07-10 1:09 UTC (permalink / raw)
To: git
Git can track file renames implicitly. If I delete and then add (under
a different name) the same content, git will figure that out.
But if a directory was renamed, I have no way to tell fast-import
about it. I can't delete the directory (using a 'D' command) and then
add it back (with a different name) with all its contents, because my
source material (an svn dump file) doesn't tell me, at that point,
about all the files involved because nothing about them has changed.
fast-import knows about the contents of the directory I want to
rename, but doesn't give me a primitive to do the rename. Is this
something we need to add? My frontend could keep track of this, but I
would be duplicating work that fast-import is already doing.
Cheers,
- David
--
If I have not seen farther, it is because I have stood in the
footsteps of giants.
* [PATCH] Support wholesale directory renames in fast-import
2007-07-10 1:09 how to do directory renames in fast-import David Frech
@ 2007-07-10 3:10 ` Shawn O. Pearce
2007-07-10 4:16 ` David Frech
2007-07-10 8:44 ` Rogan Dawes
0 siblings, 2 replies; 10+ messages in thread
From: Shawn O. Pearce @ 2007-07-10 3:10 UTC (permalink / raw)
To: David Frech; +Cc: git
Some source material (e.g. Subversion dump files) performs directory
renames without telling us exactly which files in that subdirectory
were moved. This makes it hard for a frontend to convert such data
formats to a fast-import stream, as all the frontend has on hand
is "Rename a/ to b/" with no details about what files are in a/,
unless the frontend also kept track of all files.
The new 'R' subcommand within a commit allows the frontend to
rename either a file or an entire subdirectory, without needing to
know the object's SHA-1 or the specific files contained within it.
The rename is performed as efficiently as possible internally,
making it cheaper than a 'D'/'M' pair for a file rename.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
David Frech <david@nimblemachines.com> wrote:
> Git can track file renames implicitly. If I delete and then add (under
> a different name) the same content, git will figure that out.
>
> But if a directory was renamed, I have no way to tell fast-import
> about it. I can't delete the directory (using a 'D' command) and then
> add it back (with a different name) with all its contents, because my
> source material (an svn dump file) doesn't tell me, at that point,
> about all the files involved because nothing about them has changed.
>
> fast-import knows about the contents of the directory I want to
> rename, but doesn't give me a primitive to do the rename. Is this
> something we need to add? My frontend could keep track of this, but I
> would be duplicating work that fast-import is already doing.
Does the following do the trick for you? It is also available
from my fastimport.git master branch:
git://repo.or.cz/git/fastimport.git master
http://repo.or.cz/r/git/fastimport.git master
Yes, it passes all tests...
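For frontends written in C, the 'R' syntax described above could be
produced roughly as follows. This is only a minimal sketch: the names
put(), append_path() and format_rename() are hypothetical and are not
part of fast-import. It assumes C-style quoting (double quotes with
backslash escapes) and, to stay on the safe side, quotes any path that
contains SP, a double quote, a backslash or LF. Other control
characters are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one byte plus a terminating NUL; returns -1 if out is full. */
static int put(char *out, size_t outsz, size_t *n, char c)
{
	if (*n + 1 >= outsz)
		return -1;
	out[(*n)++] = c;
	out[*n] = '\0';
	return 0;
}

/* Append path, C-style quoted only when it contains SP, '"', '\\' or LF. */
static int append_path(char *out, size_t outsz, size_t *n, const char *path)
{
	int quote = strpbrk(path, " \"\\\n") != NULL;
	const char *p;

	if (quote && put(out, outsz, n, '"'))
		return -1;
	for (p = path; *p; p++) {
		char c = *p;

		if (quote && (c == '"' || c == '\\' || c == '\n')) {
			/* Escape with a backslash; LF is written as \n. */
			if (put(out, outsz, n, '\\'))
				return -1;
			if (c == '\n')
				c = 'n';
		}
		if (put(out, outsz, n, c))
			return -1;
	}
	if (quote && put(out, outsz, n, '"'))
		return -1;
	return 0;
}

/* Hypothetical helper: writes "R <src> <dst>\n" into out; 0 on success. */
int format_rename(char *out, size_t outsz, const char *src, const char *dst)
{
	size_t n = 0;

	assert(out && src && dst);
	if (!outsz)
		return -1;
	out[0] = '\0';
	if (put(out, outsz, &n, 'R') || put(out, outsz, &n, ' ') ||
	    append_path(out, outsz, &n, src) ||
	    put(out, outsz, &n, ' ') ||
	    append_path(out, outsz, &n, dst) ||
	    put(out, outsz, &n, '\n'))
		return -1;
	return 0;
}
```

Calling format_rename(buf, sizeof(buf), "my dir", "b") would leave
`R "my dir" b` followed by LF in buf, ready to be written inside a
commit command.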
Documentation/git-fast-import.txt | 28 ++++++++++-
fast-import.c | 91 ++++++++++++++++++++++++++++++-------
t/t9300-fast-import.sh | 68 +++++++++++++++++++++++++++
3 files changed, 168 insertions(+), 19 deletions(-)
diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index c66af7c..80a8ee0 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -302,7 +302,7 @@ change to the project.
data
('from' SP <committish> LF)?
('merge' SP <committish> LF)?
- (filemodify | filedelete | filedeleteall)*
+ (filemodify | filedelete | filerename | filedeleteall)*
LF
....
@@ -325,11 +325,13 @@ commit message use a 0 length data. Commit messages are free-form
and are not interpreted by Git. Currently they must be encoded in
UTF-8, as fast-import does not permit other encodings to be specified.
-Zero or more `filemodify`, `filedelete` and `filedeleteall` commands
+Zero or more `filemodify`, `filedelete`, `filename` and
+`filedeleteall` commands
may be included to update the contents of the branch prior to
creating the commit. These commands may be supplied in any order.
However it is recommended that a `filedeleteall` command preceed
-all `filemodify` commands in the same commit, as `filedeleteall`
+all `filemodify` and `filerename` commands in the same commit, as
+`filedeleteall`
wipes the branch clean (see below).
`author`
@@ -495,6 +497,26 @@ here `<path>` is the complete path of the file or subdirectory to
be removed from the branch.
See `filemodify` above for a detailed description of `<path>`.
+`filerename`
+^^^^^^^^^^^^
+Renames an existing file or subdirectory to a different location
+within the branch. The existing file or directory must exist. If
+the destination exists it will be replaced by the source directory.
+
+....
+ 'R' SP <path> SP <path> LF
+....
+
+here the first `<path>` is the source location and the second
+`<path>` is the destination. See `filemodify` above for a detailed
+description of what `<path>` may look like. To use a source path
+that contains SP the path must be quoted.
+
+A `filerename` command takes effect immediately. Once the source
+location has been renamed to the destination any future commands
+applied to the source location will create new files there and not
+impact the destination of the rename.
+
`filedeleteall`
^^^^^^^^^^^^^^^
Included in a `commit` command to remove all files (and also all
diff --git a/fast-import.c b/fast-import.c
index f9bfcc7..a1cb13f 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -26,9 +26,10 @@ Format of STDIN stream:
lf;
commit_msg ::= data;
- file_change ::= file_clr | file_del | file_obm | file_inm;
+ file_change ::= file_clr | file_del | file_rnm | file_obm | file_inm;
file_clr ::= 'deleteall' lf;
file_del ::= 'D' sp path_str lf;
+ file_rnm ::= 'R' sp path_str sp path_str lf;
file_obm ::= 'M' sp mode sp (hexsha1 | idnum) sp path_str lf;
file_inm ::= 'M' sp mode sp 'inline' sp path_str lf
data;
@@ -1154,7 +1155,8 @@ static int tree_content_set(
struct tree_entry *root,
const char *p,
const unsigned char *sha1,
- const uint16_t mode)
+ const uint16_t mode,
+ struct tree_content *subtree)
{
struct tree_content *t = root->tree;
const char *slash1;
@@ -1168,20 +1170,22 @@ static int tree_content_set(
n = strlen(p);
if (!n)
die("Empty path component found in input");
+ if (!slash1 && !S_ISDIR(mode) && subtree)
+ die("Non-directories cannot have subtrees");
for (i = 0; i < t->entry_count; i++) {
e = t->entries[i];
if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {
if (!slash1) {
- if (e->versions[1].mode == mode
+ if (!S_ISDIR(mode)
+ && e->versions[1].mode == mode
&& !hashcmp(e->versions[1].sha1, sha1))
return 0;
e->versions[1].mode = mode;
hashcpy(e->versions[1].sha1, sha1);
- if (e->tree) {
+ if (e->tree)
release_tree_content_recursive(e->tree);
- e->tree = NULL;
- }
+ e->tree = subtree;
hashclr(root->versions[1].sha1);
return 1;
}
@@ -1191,7 +1195,7 @@ static int tree_content_set(
}
if (!e->tree)
load_tree(e);
- if (tree_content_set(e, slash1 + 1, sha1, mode)) {
+ if (tree_content_set(e, slash1 + 1, sha1, mode, subtree)) {
hashclr(root->versions[1].sha1);
return 1;
}
@@ -1209,9 +1213,9 @@ static int tree_content_set(
if (slash1) {
e->tree = new_tree_content(8);
e->versions[1].mode = S_IFDIR;
- tree_content_set(e, slash1 + 1, sha1, mode);
+ tree_content_set(e, slash1 + 1, sha1, mode, subtree);
} else {
- e->tree = NULL;
+ e->tree = subtree;
e->versions[1].mode = mode;
hashcpy(e->versions[1].sha1, sha1);
}
@@ -1219,7 +1223,10 @@ static int tree_content_set(
return 1;
}
-static int tree_content_remove(struct tree_entry *root, const char *p)
+static int tree_content_remove(
+ struct tree_entry *root,
+ const char *p,
+ struct tree_entry *backup_leaf)
{
struct tree_content *t = root->tree;
const char *slash1;
@@ -1239,13 +1246,14 @@ static int tree_content_remove(struct tree_entry *root, const char *p)
goto del_entry;
if (!e->tree)
load_tree(e);
- if (tree_content_remove(e, slash1 + 1)) {
+ if (tree_content_remove(e, slash1 + 1, backup_leaf)) {
for (n = 0; n < e->tree->entry_count; n++) {
if (e->tree->entries[n]->versions[1].mode) {
hashclr(root->versions[1].sha1);
return 1;
}
}
+ backup_leaf = NULL;
goto del_entry;
}
return 0;
@@ -1254,10 +1262,11 @@ static int tree_content_remove(struct tree_entry *root, const char *p)
return 0;
del_entry:
- if (e->tree) {
+ if (backup_leaf)
+ memcpy(backup_leaf, e, sizeof(*backup_leaf));
+ else if (e->tree)
release_tree_content_recursive(e->tree);
- e->tree = NULL;
- }
+ e->tree = NULL;
e->versions[1].mode = 0;
hashclr(e->versions[1].sha1);
hashclr(root->versions[1].sha1);
@@ -1629,7 +1638,7 @@ static void file_change_m(struct branch *b)
typename(type), command_buf.buf);
}
- tree_content_set(&b->branch_tree, p, sha1, S_IFREG | mode);
+ tree_content_set(&b->branch_tree, p, sha1, S_IFREG | mode, NULL);
free(p_uq);
}
@@ -1645,10 +1654,58 @@ static void file_change_d(struct branch *b)
die("Garbage after path in: %s", command_buf.buf);
p = p_uq;
}
- tree_content_remove(&b->branch_tree, p);
+ tree_content_remove(&b->branch_tree, p, NULL);
free(p_uq);
}
+static void file_change_r(struct branch *b)
+{
+ const char *s, *d;
+ char *s_uq, *d_uq;
+ const char *endp;
+ struct tree_entry leaf;
+
+ s = command_buf.buf + 2;
+ s_uq = unquote_c_style(s, &endp);
+ if (s_uq) {
+ if (*endp != ' ')
+ die("Missing space after source: %s", command_buf.buf);
+ }
+ else {
+ endp = strchr(s, ' ');
+ if (!endp)
+ die("Missing space after source: %s", command_buf.buf);
+ s_uq = xmalloc(endp - s + 1);
+ memcpy(s_uq, s, endp - s);
+ s_uq[endp - s] = 0;
+ }
+ s = s_uq;
+
+ endp++;
+ if (!*endp)
+ die("Missing dest: %s", command_buf.buf);
+
+ d = endp;
+ d_uq = unquote_c_style(d, &endp);
+ if (d_uq) {
+ if (*endp)
+ die("Garbage after dest in: %s", command_buf.buf);
+ d = d_uq;
+ }
+
+ memset(&leaf, 0, sizeof(leaf));
+ tree_content_remove(&b->branch_tree, s, &leaf);
+ if (!leaf.versions[1].mode)
+ die("Path %s not in branch", s);
+ tree_content_set(&b->branch_tree, d,
+ leaf.versions[1].sha1,
+ leaf.versions[1].mode,
+ leaf.tree);
+
+ free(s_uq);
+ free(d_uq);
+}
+
static void file_change_deleteall(struct branch *b)
{
release_tree_content_recursive(b->branch_tree.tree);
@@ -1816,6 +1873,8 @@ static void cmd_new_commit(void)
file_change_m(b);
else if (!prefixcmp(command_buf.buf, "D "))
file_change_d(b);
+ else if (!prefixcmp(command_buf.buf, "R "))
+ file_change_r(b);
else if (!strcmp("deleteall", command_buf.buf))
file_change_deleteall(b);
else
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 53774c8..bf3720d 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -580,4 +580,72 @@ test_expect_success \
git diff --raw L^ L >output &&
git diff expect output'
+###
+### series M
+###
+
+test_tick
+cat >input <<INPUT_END
+commit refs/heads/M1
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+file rename
+COMMIT
+
+from refs/heads/branch^0
+R file2/newf file2/n.e.w.f
+
+INPUT_END
+
+cat >expect <<EOF
+:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc R100 file2/newf file2/n.e.w.f
+EOF
+test_expect_success \
+ 'M: rename file in same subdirectory' \
+ 'git-fast-import <input &&
+ git diff-tree -M -r M1^ M1 >actual &&
+ compare_diff_raw expect actual'
+
+cat >input <<INPUT_END
+commit refs/heads/M2
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+file rename
+COMMIT
+
+from refs/heads/branch^0
+R file2/newf i/am/new/to/you
+
+INPUT_END
+
+cat >expect <<EOF
+:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc R100 file2/newf i/am/new/to/you
+EOF
+test_expect_success \
+ 'M: rename file to new subdirectory' \
+ 'git-fast-import <input &&
+ git diff-tree -M -r M2^ M2 >actual &&
+ compare_diff_raw expect actual'
+
+cat >input <<INPUT_END
+commit refs/heads/M3
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+file rename
+COMMIT
+
+from refs/heads/M2^0
+R i other/sub
+
+INPUT_END
+
+cat >expect <<EOF
+:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc R100 i/am/new/to/you other/sub/am/new/to/you
+EOF
+test_expect_success \
+ 'M: rename subdirectory to new subdirectory' \
+ 'git-fast-import <input &&
+ git diff-tree -M -r M3^ M3 >actual &&
+ compare_diff_raw expect actual'
+
test_done
--
1.5.3.rc0.879.g64b8
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 3:10 ` [PATCH] Support wholesale " Shawn O. Pearce
@ 2007-07-10 4:16 ` David Frech
2007-07-10 14:03 ` Uwe Kleine-König
2007-07-10 8:44 ` Rogan Dawes
1 sibling, 1 reply; 10+ messages in thread
From: David Frech @ 2007-07-10 4:16 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
This should do nicely! Thank you!
Now my challenge is that the svn dump doesn't *actually* say "rename
a/ to b/"; it says "copy a/ to b/; delete a/", so I have to infer the
rename.
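One way a frontend might infer the rename is to remember each copy it
sees in a revision and turn a later delete of the copy's source into
an 'R' command. The following is a minimal sketch under several
assumptions: the type and function names are hypothetical, paths need
no quoting, a source is copied to at most one destination, and nothing
under the source is modified between the copy and the delete. Copies
that are never matched by a delete stay pending and would have to be
handled some other way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PENDING 16
#define MAX_PATH_LEN 256

/* A copy seen earlier in the current revision, not yet matched by a delete. */
struct pending_copy {
	char src[MAX_PATH_LEN];
	char dst[MAX_PATH_LEN];
};

struct rename_inferrer {
	struct pending_copy copies[MAX_PENDING];
	size_t count;
};

void inferrer_init(struct rename_inferrer *r)
{
	assert(r);
	r->count = 0;
}

/* Remember "copy src to dst"; returns -1 if the table or a path is too big. */
int note_copy(struct rename_inferrer *r, const char *src, const char *dst)
{
	struct pending_copy *c;

	if (r->count == MAX_PENDING ||
	    strlen(src) >= MAX_PATH_LEN || strlen(dst) >= MAX_PATH_LEN)
		return -1;
	c = &r->copies[r->count++];
	strcpy(c->src, src);
	strcpy(c->dst, dst);
	return 0;
}

/*
 * Handle "delete path". If path is the source of a pending copy, write
 * "R src dst\n" and return 1; otherwise write "D path\n" and return 0.
 * Returns -1 if out is too small.
 */
int note_delete(struct rename_inferrer *r, const char *path,
		char *out, size_t outsz)
{
	size_t i;
	int len;

	for (i = 0; i < r->count; i++) {
		if (!strcmp(r->copies[i].src, path)) {
			len = snprintf(out, outsz, "R %s %s\n",
				       r->copies[i].src, r->copies[i].dst);
			/* Drop the matched copy by moving the last one into its slot. */
			r->copies[i] = r->copies[--r->count];
			return (len < 0 || (size_t)len >= outsz) ? -1 : 1;
		}
	}
	len = snprintf(out, outsz, "D %s\n", path);
	return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
}
```

With this, "copy a to b" followed by "delete a" yields `R a b`, while
a delete with no matching copy falls through to a plain `D` command.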
But your patch makes my import possible, and it wasn't before!
Cheers,
- David
On 7/9/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> Some source material (e.g. Subversion dump files) performs directory
> renames without telling us exactly which files in that subdirectory
> were moved. This makes it hard for a frontend to convert such data
> formats to a fast-import stream, as all the frontend has on hand
> is "Rename a/ to b/" with no details about what files are in a/,
> unless the frontend also kept track of all files.
>
> The new 'R' subcommand within a commit allows the frontend to
> rename either a file or an entire subdirectory, without needing to
> know the object's SHA-1 or the specific files contained within it.
> The rename is performed as efficiently as possible internally,
> making it cheaper than a 'D'/'M' pair for a file rename.
>
> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
> ---
>
> David Frech <david@nimblemachines.com> wrote:
> > Git can track file renames implicitly. If I delete and then add (under
> > a different name) the same content, git will figure that out.
> >
> > But if a directory was renamed, I have no way to tell fast-import
> > about it. I can't delete the directory (using a 'D' command) and then
> > add it back (with a different name) with all its contents, because my
> > source material (an svn dump file) doesn't tell me, at that point,
> > about all the files involved because nothing about them has changed.
> >
> > fast-import knows about the contents of the directory I want to
> > rename, but doesn't give me a primitive to do the rename. Is this
> > something we need to add? My frontend could keep track of this, but I
> > would be duplicating work that fast-import is already doing.
>
> Does the following do the trick for you? It is also available
> from my fastimport.git master branch:
>
> git://repo.or.cz/git/fastimport.git master
> http://repo.or.cz/r/git/fastimport.git master
>
> Yes, it passes all tests...
>
[patch elided]
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 3:10 ` [PATCH] Support wholesale " Shawn O. Pearce
2007-07-10 4:16 ` David Frech
@ 2007-07-10 8:44 ` Rogan Dawes
2007-07-10 13:55 ` Shawn O. Pearce
1 sibling, 1 reply; 10+ messages in thread
From: Rogan Dawes @ 2007-07-10 8:44 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: David Frech, git
Shawn O. Pearce wrote:
> -Zero or more `filemodify`, `filedelete` and `filedeleteall` commands
> +Zero or more `filemodify`, `filedelete`, `filename` and
^^ filerename
Rogan
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 8:44 ` Rogan Dawes
@ 2007-07-10 13:55 ` Shawn O. Pearce
0 siblings, 0 replies; 10+ messages in thread
From: Shawn O. Pearce @ 2007-07-10 13:55 UTC (permalink / raw)
To: Rogan Dawes; +Cc: David Frech, git
Rogan Dawes <lists@dawes.za.net> wrote:
> Shawn O. Pearce wrote:
> >-Zero or more `filemodify`, `filedelete` and `filedeleteall` commands
> >+Zero or more `filemodify`, `filedelete`, `filename` and
> ^^ filerename
Ugh. Thanks. I just pushed out a corrected version.
--
Shawn.
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 4:16 ` David Frech
@ 2007-07-10 14:03 ` Uwe Kleine-König
2007-07-10 14:14 ` Shawn O. Pearce
0 siblings, 1 reply; 10+ messages in thread
From: Uwe Kleine-König @ 2007-07-10 14:03 UTC (permalink / raw)
To: David Frech; +Cc: Shawn O. Pearce, git
Hallo David,
David Frech wrote:
> Now my challenge is that the svn dump doesn't *actually* say "rename
> a/ to b/"; it says "copy a/ to b/; delete a/", so I have to infer the
> rename.
I don't know fast-import very well, but why not do exactly what the
dump file suggests: copy a b; delete a ?
Best regards
Uwe
--
Uwe Kleine-König
http://www.google.com/search?q=12+divided+by+3
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 14:03 ` Uwe Kleine-König
@ 2007-07-10 14:14 ` Shawn O. Pearce
2007-07-10 19:55 ` David Frech
0 siblings, 1 reply; 10+ messages in thread
From: Shawn O. Pearce @ 2007-07-10 14:14 UTC (permalink / raw)
To: Uwe Kleine-König; +Cc: David Frech, git
Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de> wrote:
> David Frech wrote:
> > Now my challenge is that the svn dump doesn't *actually* say "rename
> > a/ to b/"; it says "copy a/ to b/; delete a/", so I have to infer the
> > rename.
>
> I don't know fast-import very well, but why not do exactly what the
> dump file suggests: copy a b; delete a ?
Because there is no copy operator in fast-import. So you cannot
do "copy a b". Apparently that's what I should have implemented,
as rename in Git really is as simple as the copy/delete pair. Ugh.
Copy isn't really that hard, it just can't be nearly as efficient as
rename, as copying a subtree will force me to either duplicate data
in memory or reload trees from disk to duplicate data in memory.
But it's a copy, so data duplication is expected. ;-)
I'll implement a copy operator soon. Shouldn't be too difficult.
Maybe someone else would like to take a shot at implementing it...
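The cost difference between rename and copy mentioned above can be
illustrated with a toy tree. This is only a sketch: struct toy_tree and
the toy_* functions are hypothetical stand-ins, not fast-import's own
data structures. A rename hands the existing subtree pointer to its new
location without allocating anything, while a copy has to duplicate
every node it reaches.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an in-memory tree; not fast-import's struct tree_content. */
struct toy_tree {
	const char *name;
	struct toy_tree *child; /* first entry inside this directory */
	struct toy_tree *next;  /* next sibling in the parent directory */
};

struct toy_tree *toy_new(const char *name)
{
	struct toy_tree *t = calloc(1, sizeof(*t));

	if (!t)
		abort();
	t->name = name;
	return t;
}

/* Rename: detach the subtree from its old slot and hand it over as-is. */
struct toy_tree *toy_move(struct toy_tree **slot)
{
	struct toy_tree *t;

	assert(slot);
	t = *slot;
	*slot = NULL;
	return t;
}

/*
 * Copy: duplicate s, its children and any siblings linked from s.
 * *allocs is incremented once per node allocated. Names are shared,
 * not duplicated.
 */
struct toy_tree *toy_dup(const struct toy_tree *s, size_t *allocs)
{
	struct toy_tree *d;

	assert(allocs);
	if (!s)
		return NULL;
	d = toy_new(s->name);
	(*allocs)++;
	d->child = toy_dup(s->child, allocs);
	d->next = toy_dup(s->next, allocs);
	return d;
}
```

For a directory holding two files, toy_dup() allocates three nodes,
whereas toy_move() allocates none and leaves the old slot empty.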
--
Shawn.
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 14:14 ` Shawn O. Pearce
@ 2007-07-10 19:55 ` David Frech
2007-07-11 7:57 ` Shawn O. Pearce
0 siblings, 1 reply; 10+ messages in thread
From: David Frech @ 2007-07-10 19:55 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: Uwe Kleine-König, git
Hmm. I think Uwe is right. Copy is probably the "right" primitive, and
rename can always be synthesized from copy+delete.
Since Subversion is built around the idea of "cheap copies" there is
no incentive for them to represent renames other than as "copy, then
delete".
But isn't the same true in a way of git? If I copy a directory (a
tree), then the new tree is the same tree - it has the same SHA-1
hash, so I can simply refer to the existing object. Same for file
blobs.
Subversion dump files have *lots* of copies. Might be nice to be able
to feed these directly into fast-import and have it DTRT, esp if it
was smart about sharing identical data structures.
- David
On 7/10/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de> wrote:
> > David Frech wrote:
> > > Now my challenge is that the svn dump doesn't *actually* say "rename
> > > a/ to b/"; it says "copy a/ to b/; delete a/", so I have to infer the
> > > rename.
> >
> > I don't know fast-import very well, but why not do exactly what the
> > dump file suggests: copy a b; delete a ?
>
> Because there is no copy operator in fast-import. So you cannot
> do "copy a b". Apparently that's what I should have implemented,
> as rename in Git really is as simple as the copy/delete pair. Ugh.
>
> Copy isn't really that hard, it just can't be nearly as efficient as
> rename, as copying a subtree will force me to either duplicate data
> in memory or reload trees from disk to duplicate data in memory.
> But it's a copy, so data duplication is expected. ;-)
>
> I'll implement a copy operator soon. Shouldn't be too difficult.
> Maybe someone else would like to take a shot at implementing it...
>
> --
> Shawn.
>
--
If I have not seen farther, it is because I have stood in the
footsteps of giants.
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-10 19:55 ` David Frech
@ 2007-07-11 7:57 ` Shawn O. Pearce
2007-07-11 23:11 ` David Frech
0 siblings, 1 reply; 10+ messages in thread
From: Shawn O. Pearce @ 2007-07-11 7:57 UTC (permalink / raw)
To: David Frech; +Cc: Uwe Kleine-König, git
David Frech <david@nimblemachines.com> wrote:
> Hmm. I think Uwe is right. Copy is probably the "right" primitive, and
> rename can always be synthesized from copy+delete.
>
> Since Subversion is built around the idea of "cheap copies" there is
> no incentive for them to represent renames other than as "copy, then
> delete".
>
> But isn't the same true in a way of git? If I copy a directory (a
> tree), then the new tree is the same tree - it has the same SHA-1
> hash, so I can simply refer to the existing object. Same for file
> blobs.
>
> Subversion dump files have *lots* of copies. Might be nice to be able
> to feed these directly into fast-import and have it DTRT, esp if it
> was smart about sharing identical data structures.
Yes. All of that is true. ;-)
I'm tired. I just worked an 18 hour day. I need to go do it all
over again in about 4 hours. So I'm going to head off to bed. But
I did manage to implement this (I think). It's totally untested.
But feel free to poke at it:
git://repo.or.cz/git/fastimport.git copy-wip
I'll write documentation and unit tests tomorrow. And fix any bugs,
if any get identified.
The implementation should copy as little memory as possible to do the
actual copy. This should make a C/D pair about as efficient as an
R command if the directory being copied has not yet been modified
as part of the current commit (this is probably typical for an
SVN dump file). The only difference should be a slight increase
in running time for the C/D pair, as directory entry lookup in
fast-import is O(n).
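The O(n) lookup referred to here is a linear scan over the entries of
one directory, comparing each name against the first component of the
path. A minimal sketch of that scan follows; find_component() and the
plain array of names are hypothetical simplifications, not the
structures fast-import uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the entry whose name equals the first component
 * of path (the text before the first '/'), or -1 if there is none.
 * The scan is linear in the number of entries.
 */
int find_component(const char *const *names, size_t count, const char *path)
{
	const char *slash;
	size_t i, n;

	assert(names && path);
	slash = strchr(path, '/');
	n = slash ? (size_t)(slash - path) : strlen(path);

	for (i = 0; i < count; i++) {
		/* Lengths must match so that "o" does not match "other". */
		if (strlen(names[i]) == n && !strncmp(path, names[i], n))
			return (int)i;
	}
	return -1;
}
```

A C/D pair walks the source path twice (once to copy, once to delete)
where a single R walks it once, which is where the slight increase in
running time would come from.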
Oh, and as always, C works with both files and directories...
Hmm. Quickly reading this diff I can actually do it shorter with
a bit of refactoring. I'll clean that up tomorrow night. I blame
it on the lack of sleep that I'm suffering from right now. ;-)
-->8--
WIP Teach fast import how to copy
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
fast-import.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 120 insertions(+), 1 deletions(-)
diff --git a/fast-import.c b/fast-import.c
index a1cb13f..41c0352 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -26,10 +26,16 @@ Format of STDIN stream:
lf;
commit_msg ::= data;
- file_change ::= file_clr | file_del | file_rnm | file_obm | file_inm;
+ file_change ::= file_clr
+ | file_del
+ | file_rnm
+ | file_cpy
+ | file_obm
+ | file_inm;
file_clr ::= 'deleteall' lf;
file_del ::= 'D' sp path_str lf;
file_rnm ::= 'R' sp path_str sp path_str lf;
+ file_cpy ::= 'C' sp path_str sp path_str lf;
file_obm ::= 'M' sp mode sp (hexsha1 | idnum) sp path_str lf;
file_inm ::= 'M' sp mode sp 'inline' sp path_str lf
data;
@@ -623,6 +629,33 @@ static void release_tree_entry(struct tree_entry *e)
avail_tree_entry = e;
}
+static struct tree_content *dup_tree_content(struct tree_content *s)
+{
+ struct tree_content *d;
+ struct tree_entry *a, *b;
+ unsigned int i, j;
+
+ if (!s)
+ return NULL;
+ d = new_tree_content(s->entry_count);
+ for (i = 0, j = 0; i < s->entry_count; i++) {
+ a = s->entries[i];
+ if (a->versions[1].mode) {
+ b = new_tree_entry();
+ memcpy(b, a, sizeof(*a));
+ if (is_null_sha1(b->versions[1].sha1))
+ b->tree = dup_tree_content(b->tree);
+ else
+ b->tree = NULL;
+ d->entries[j++] = b;
+ }
+ }
+ d->entry_count = j;
+ d->delta_depth = s->delta_depth;
+
+ return d;
+}
+
static void start_packfile(void)
{
static char tmpfile[PATH_MAX];
@@ -1273,6 +1306,43 @@ del_entry:
return 1;
}
+static int tree_content_get(
+ struct tree_entry *root,
+ const char *p,
+ struct tree_entry *leaf)
+{
+ struct tree_content *t = root->tree;
+ const char *slash1;
+ unsigned int i, n;
+ struct tree_entry *e;
+
+ slash1 = strchr(p, '/');
+ if (slash1)
+ n = slash1 - p;
+ else
+ n = strlen(p);
+
+ for (i = 0; i < t->entry_count; i++) {
+ e = t->entries[i];
+ if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {
+ if (!slash1) {
+ memcpy(leaf, e, sizeof(*leaf));
+ if (is_null_sha1(e->versions[1].sha1))
+ leaf->tree = dup_tree_content(leaf->tree);
+ else
+ leaf->tree = NULL;
+ return 1;
+ }
+ if (!S_ISDIR(e->versions[1].mode))
+ return 0;
+ if (!e->tree)
+ load_tree(e);
+ return tree_content_get(e, slash1 + 1, leaf);
+ }
+ }
+ return 0;
+}
+
static int update_branch(struct branch *b)
{
static const char *msg = "fast-import";
@@ -1706,6 +1776,53 @@ static void file_change_r(struct branch *b)
free(d_uq);
}
+static void file_change_c(struct branch *b)
+{
+ const char *s, *d;
+ char *s_uq, *d_uq;
+ const char *endp;
+ struct tree_entry leaf;
+
+ s = command_buf.buf + 2;
+ s_uq = unquote_c_style(s, &endp);
+ if (s_uq) {
+ if (*endp != ' ')
+ die("Missing space after source: %s", command_buf.buf);
+ }
+ else {
+ endp = strchr(s, ' ');
+ if (!endp)
+ die("Missing space after source: %s", command_buf.buf);
+ s_uq = xmalloc(endp - s + 1);
+ memcpy(s_uq, s, endp - s);
+ s_uq[endp - s] = 0;
+ }
+ s = s_uq;
+
+ endp++;
+ if (!*endp)
+ die("Missing dest: %s", command_buf.buf);
+
+ d = endp;
+ d_uq = unquote_c_style(d, &endp);
+ if (d_uq) {
+ if (*endp)
+ die("Garbage after dest in: %s", command_buf.buf);
+ d = d_uq;
+ }
+
+ memset(&leaf, 0, sizeof(leaf));
+ if (!tree_content_get(&b->branch_tree, s, &leaf))
+ die("Path %s not in branch", s);
+ tree_content_set(&b->branch_tree, d,
+ leaf.versions[1].sha1,
+ leaf.versions[1].mode,
+ leaf.tree);
+
+ free(s_uq);
+ free(d_uq);
+}
+
static void file_change_deleteall(struct branch *b)
{
release_tree_content_recursive(b->branch_tree.tree);
@@ -1875,6 +1992,8 @@ static void cmd_new_commit(void)
file_change_d(b);
else if (!prefixcmp(command_buf.buf, "R "))
file_change_r(b);
+ else if (!prefixcmp(command_buf.buf, "C "))
+ file_change_c(b);
else if (!strcmp("deleteall", command_buf.buf))
file_change_deleteall(b);
else
--
1.5.3.rc0.879.g64b8
--
Shawn.
* Re: [PATCH] Support wholesale directory renames in fast-import
2007-07-11 7:57 ` Shawn O. Pearce
@ 2007-07-11 23:11 ` David Frech
0 siblings, 0 replies; 10+ messages in thread
From: David Frech @ 2007-07-11 23:11 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
On 7/11/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> I'm tired. I just worked an 18 hour day. I need to go do it all
> over again in about 4 hours. So I'm going to head off to bed. But
> I did manage to implement this (I think). It's totally untested.
> But feel free to poke at it:
>
> git://repo.or.cz/git/fastimport.git copy-wip
>
> I'll write documentation and unit tests tomorrow. And fix any bugs,
> if any get identified.
Shawn, don't knock yourself out. ;-)
I think getting the semantics right (of the command set in
fast-import) is important, but it doesn't have to be right
*yesterday*. I won't need to point my command stream at fast-import
for at least another day or two. ;-) I have some subtle code to write
first that'll take me a bit to get right...
Thanks for your enthusiasm though!
Cheers,
- David
> --
> Shawn.
>
--
If I have not seen farther, it is because I have stood in the
footsteps of giants.