* [RFC PATCH 0/3] Towards a Git-to-SVN bridge
From: Ramkumar Ramachandra @ 2011-01-15 6:51 UTC
To: Git List; +Cc: Jonathan Nieder, David Barr, Sverre Rabbelier

Hi,

Over the last couple of days, I've been working on a parser that
converts a fast-import stream into a SVN dumpfile. So far, it's very
rough and works minimally for some common fast-import
commands. However, the major roadblock is persisting blobs: in this
implementation, they're persisted as an array of strbufs. This is very
memory-intensive and not scalable at all. With some valuable insight
from Jonathan on IRC, I've decided to try re-implementing fast-export
to eliminate blob marks and produce them inline instead [1].

Comments are much appreciated.

[1]: http://colabti.org/irclogger/irclogger_log/git-devel?date=2011-01-14

Ramkumar Ramachandra (3):
  date: Expose the time_to_tm function
  vcs-svn: Start working on the dumpfile producer
  Build an svn-fi target in contrib/svn-fe

 Makefile                  |    2 +-
 cache.h                   |    1 +
 contrib/svn-fe/Makefile   |   23 ++++-
 contrib/svn-fe/svn-fi.c   |   16 +++
 contrib/svn-fe/svn-fi.txt |   28 +++++
 date.c                    |    2 +-
 vcs-svn/dump_export.c     |   73 +++++++++++
 vcs-svn/svnload.c         |  294 +++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 435 insertions(+), 4 deletions(-)
 create mode 100644 contrib/svn-fe/svn-fi.c
 create mode 100644 contrib/svn-fe/svn-fi.txt
 create mode 100644 vcs-svn/dump_export.c
 create mode 100644 vcs-svn/svnload.c

-- 
1.7.4.rc1.7.g2cf08.dirty

* [PATCH 1/3] date: Expose the time_to_tm function
From: Ramkumar Ramachandra @ 2011-01-15 6:51 UTC
To: Git List; +Cc: Jonathan Nieder, David Barr, Sverre Rabbelier

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
---
 cache.h |    1 +
 date.c  |    2 +-
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/cache.h b/cache.h
index d83d68c..95fea31 100644
--- a/cache.h
+++ b/cache.h
@@ -816,6 +816,7 @@ enum date_mode {
 	DATE_RAW
 };
 
+struct tm *time_to_tm(unsigned long time, int tz);
 const char *show_date(unsigned long time, int timezone, enum date_mode mode);
 const char *show_date_relative(unsigned long time, int tz,
 			       const struct timeval *now,
diff --git a/date.c b/date.c
index 00f9eb5..e601a50 100644
--- a/date.c
+++ b/date.c
@@ -54,7 +54,7 @@ static time_t gm_time_t(unsigned long time, int tz)
  * thing, which means that tz -0100 is passed in as the integer -100,
  * even though it means "sixty minutes off"
  */
-static struct tm *time_to_tm(unsigned long time, int tz)
+struct tm *time_to_tm(unsigned long time, int tz)
 {
 	time_t t = gm_time_t(time, tz);
 	return gmtime(&t);
-- 
1.7.4.rc1.7.g2cf08.dirty

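The date.c comment in this patch says that a timezone of -0100 is passed in as the integer -100 even though it means sixty minutes. A minimal sketch of that decoding follows. The helper names `tz_to_minutes` and `shift_by_tz` are hypothetical and not part of the patch, and the sketch assumes, going only by that comment, that the offset is applied by shifting the timestamp before gmtime() is called.

```c
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): decode an offset stored as a
 * decimal +hhmm/-hhmm integer, e.g. -100 for "-0100", into signed minutes.
 */
static int tz_to_minutes(int tz)
{
	int abs_tz = abs(tz);
	int minutes = (abs_tz / 100) * 60 + (abs_tz % 100);

	return tz < 0 ? -minutes : minutes;
}

/*
 * Hypothetical helper: shift a UTC timestamp by the decoded offset, so that
 * gmtime() on the result yields wall-clock fields for that timezone.
 */
static time_t shift_by_tz(unsigned long when, int tz)
{
	return (time_t)when + (time_t)tz_to_minutes(tz) * 60;
}
```

Under this reading, 530 ("+0530") decodes to 330 minutes. A caller that wants fields in UTC, for example to print a timestamp ending in "Z", would presumably pass an offset of 0 rather than the author's timezone.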
* [PATCH 2/3] vcs-svn: Start working on the dumpfile producer 2011-01-15 6:51 [RFC PATCH 0/3] Towards a Git-to-SVN bridge Ramkumar Ramachandra 2011-01-15 6:51 ` [PATCH 1/3] date: Expose the time_to_tm function Ramkumar Ramachandra @ 2011-01-15 6:51 ` Ramkumar Ramachandra 2011-01-15 7:39 ` Peter Baumann 2011-01-15 6:51 ` [PATCH 3/3] Build an svn-fi target in contrib/svn-fe Ramkumar Ramachandra 2011-01-15 7:22 ` [RFC PATCH 0/3] Towards a Git-to-SVN bridge Jonathan Nieder 3 siblings, 1 reply; 8+ messages in thread From: Ramkumar Ramachandra @ 2011-01-15 6:51 UTC (permalink / raw) To: Git List; +Cc: Jonathan Nieder, David Barr, Sverre Rabbelier Start off with some broad design sketches. Compile succeeds, but parser is incorrect. Include a Makefile rule to build it into vcs-svn/lib.a. Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> --- Makefile | 2 +- vcs-svn/dump_export.c | 73 ++++++++++++ vcs-svn/svnload.c | 294 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 368 insertions(+), 1 deletions(-) create mode 100644 vcs-svn/dump_export.c create mode 100644 vcs-svn/svnload.c diff --git a/Makefile b/Makefile index 1345c38..40f6691 100644 --- a/Makefile +++ b/Makefile @@ -1834,7 +1834,7 @@ ifndef NO_CURL endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o -VCSSVN_OBJS = vcs-svn/line_buffer.o \ +VCSSVN_OBJS = vcs-svn/line_buffer.o vcs-svn/svnload.o vcs-svn/dump_export.o \ vcs-svn/repo_tree.o vcs-svn/fast_export.o vcs-svn/sliding_window.o \ vcs-svn/svndiff.o vcs-svn/svndump.o VCSSVN_TEST_OBJS = test-obj-pool.o \ diff --git a/vcs-svn/dump_export.c b/vcs-svn/dump_export.c new file mode 100644 index 0000000..04ede06 --- /dev/null +++ b/vcs-svn/dump_export.c @@ -0,0 +1,73 @@ +/* + * Licensed under a two-clause BSD-style license. + * See LICENSE for details. 
+ */ + +#include "git-compat-util.h" +#include "strbuf.h" +#include "line_buffer.h" +#include "dump_export.h" + +void dump_export_begin_rev(int revision, const char *revprops, + int prop_len) { + printf("Revision-number: %d\n", revision); + printf("Prop-content-length: %d\n", prop_len); + printf("Content-length: %d\n\n", prop_len); + printf("%s\n", revprops); +} + +void dump_export_node(const char *path, enum node_kind kind, + enum node_action action, unsigned long text_len, + unsigned long copyfrom_rev, const char *copyfrom_path) { + printf("Node-path: %s\n", path); + printf("Node-kind: "); + switch (action) { + case NODE_KIND_NORMAL: + printf("file\n"); + break; + case NODE_KIND_EXECUTABLE: + printf("file\n"); + break; + case NODE_KIND_SYMLINK: + printf("file\n"); + break; + case NODE_KIND_GITLINK: + printf("file\n"); + break; + case NODE_KIND_SUBDIR: + die("Unsupported: subdirectory"); + default: + break; + } + printf("Node-action: "); + switch (action) { + case NODE_ACTION_CHANGE: + printf("change\n"); + break; + case NODE_ACTION_ADD: + printf("add\n"); + break; + case NODE_ACTION_REPLACE: + printf("replace\n"); + break; + case NODE_ACTION_DELETE: + printf("delete\n"); + break; + default: + break; + } + if (copyfrom_rev != SVN_INVALID_REV) { + printf("Node-copyfrom-rev: %lu\n", copyfrom_rev); + printf("Node-copyfrom-path: %s\n", copyfrom_path); + } + printf("Prop-delta: false\n"); + printf("Prop-content-length: 10\n"); /* Constant 10 for "PROPS-END" */ + printf("Text-delta: false\n"); + printf("Text-content-length: %lu\n", text_len); + printf("Content-length: %lu\n\n", text_len + 10); + printf("PROPS-END\n\n"); +} + +void dump_export_text(struct line_buffer *data, off_t len) { + buffer_copy_bytes(data, len); +} diff --git a/vcs-svn/svnload.c b/vcs-svn/svnload.c new file mode 100644 index 0000000..7043ae7 --- /dev/null +++ b/vcs-svn/svnload.c @@ -0,0 +1,294 @@ +/* + * Produce a dumpfile v3 from a fast-import stream. 
+ * Load the dump into the SVN repository with: + * svnrdump load <URL> <dumpfile + * + * Licensed under a two-clause BSD-style license. + * See LICENSE for details. + */ + +#include "cache.h" +#include "git-compat-util.h" +#include "line_buffer.h" +#include "dump_export.h" +#include "strbuf.h" + +#define SVN_DATE_FORMAT "%Y-%m-%dT%H:%M:%S.000000Z" +#define SVN_DATE_LEN 28 +#define LENGTH_UNKNOWN (~0) + +static struct line_buffer input = LINE_BUFFER_INIT; +static struct strbuf blobs[100]; + +static struct { + unsigned long prop_len, text_len, copyfrom_rev, mark; + int text_delta, prop_delta; /* Boolean */ + enum node_action action; + enum node_kind kind; + struct strbuf copyfrom_path, path; +} node_ctx; + +static struct { + int rev, text_len; + struct strbuf props, log; + struct strbuf svn_author, author, committer; + struct strbuf author_date, committer_date; + struct strbuf author_email, committer_email; +} rev_ctx; + +static enum { + UNKNOWN_CTX, + COMMIT_CTX, + BLOB_CTX +} active_ctx; + +static void reset_rev_ctx(int revision) +{ + rev_ctx.rev = revision; + strbuf_reset(&rev_ctx.props); + strbuf_reset(&rev_ctx.log); + strbuf_reset(&rev_ctx.svn_author); + strbuf_reset(&rev_ctx.author); + strbuf_reset(&rev_ctx.committer); + strbuf_reset(&rev_ctx.author_date); + strbuf_reset(&rev_ctx.committer_date); + strbuf_reset(&rev_ctx.author_email); + strbuf_reset(&rev_ctx.committer_email); +} + +static void reset_node_ctx(void) +{ + node_ctx.prop_len = LENGTH_UNKNOWN; + node_ctx.text_len = LENGTH_UNKNOWN; + node_ctx.mark = 0; + node_ctx.copyfrom_rev = 0; + node_ctx.text_delta = -1; + node_ctx.prop_delta = -1; + strbuf_reset(&node_ctx.copyfrom_path); + strbuf_reset(&node_ctx.path); +} + +static void populate_props(struct strbuf *props, const char *author, + const char *log, const char *date) { + strbuf_reset(props); + strbuf_addf(props, "K\nsvn:author\nV\n%s\n", author); + strbuf_addf(props, "K\nsvn:log\nV\n%s", log); + strbuf_addf(props, "K\nsvn:date\nV\n%s\n", date); + 
strbuf_add(props, "PROPS-END\n", 10); +} + +static void parse_author_line(char *val, struct strbuf *name, + struct strbuf *email, struct strbuf *date) { + char *t, *tz_off; + char time_buf[SVN_DATE_LEN]; + const struct tm *tm_time; + + /* Simon Hausmann <shausman@trolltech.com> 1170199019 +0100 */ + strbuf_reset(name); + strbuf_reset(email); + strbuf_reset(date); + tz_off = strrchr(val, ' '); + *tz_off++ = '\0'; + t = strrchr(val, ' '); + *(t - 1) = '\0'; /* Ignore '>' from email */ + t ++; + tm_time = time_to_tm(strtoul(t, NULL, 10), atoi(tz_off)); + strftime(time_buf, SVN_DATE_LEN, SVN_DATE_FORMAT, tm_time); + strbuf_add(date, time_buf, SVN_DATE_LEN); + t = strchr(val, '<'); + *(t - 1) = '\0'; /* Ignore ' <' from email */ + t ++; + strbuf_add(email, t, strlen(t)); + strbuf_add(name, val, strlen(val)); +} + +void svnload_read(void) { + char *t, *val; + int mode_incr; + struct strbuf *to_dump; + + while ((t = buffer_read_line(&input))) { + val = strchr(t, ' '); + if (!val) { + if (!memcmp(t, "blob", 4)) + active_ctx = BLOB_CTX; + else if (!memcmp(t, "deleteall", 9)) + ; + continue; + } + *val++ = '\0'; + + /* strlen(key) */ + switch (val - t - 1) { + case 1: + if (!memcmp(t, "D", 1)) { + node_ctx.action = NODE_ACTION_DELETE; + } + else if (!memcmp(t, "C", 1)) { + node_ctx.action = NODE_ACTION_ADD; + } + else if (!memcmp(t, "R", 1)) { + node_ctx.action = NODE_ACTION_REPLACE; + } + else if (!memcmp(t, "M", 1)) { + node_ctx.action = NODE_ACTION_CHANGE; + mode_incr = 7; + if (!memcmp(val, "100644", 6)) + node_ctx.kind = NODE_KIND_NORMAL; + else if (!memcmp(val, "100755", 6)) + node_ctx.kind = NODE_KIND_EXECUTABLE; + else if (!memcmp(val, "120000", 6)) + node_ctx.kind = NODE_KIND_SYMLINK; + else if (!memcmp(val, "160000", 6)) + node_ctx.kind = NODE_KIND_GITLINK; + else if (!memcmp(val, "040000", 6)) + node_ctx.kind = NODE_KIND_SUBDIR; + else { + if (!memcmp(val, "755", 3)) + node_ctx.kind = NODE_KIND_EXECUTABLE; + else if(!memcmp(val, "644", 3)) + node_ctx.kind = 
NODE_KIND_NORMAL; + else + die("Unrecognized mode: %s", val); + mode_incr = 4; + } + val += mode_incr; + t = strchr(val, ' '); + *t++ = '\0'; + strbuf_reset(&node_ctx.path); + strbuf_add(&node_ctx.path, t, strlen(t)); + if (!memcmp(val + 1, "inline", 6)) + die("Unsupported dataref: inline"); + else if (*val == ':') + to_dump = &blobs[strtoul(val + 1, NULL, 10)]; + else + die("Unsupported dataref: sha1"); + dump_export_node(node_ctx.path.buf, node_ctx.kind, + node_ctx.action, to_dump->len, + 0, NULL); + printf("%s", to_dump->buf); + } + break; + case 3: + if (!memcmp(t, "tag", 3)) + continue; + break; + case 4: + if (!memcmp(t, "mark", 4)) + switch(active_ctx) { + case COMMIT_CTX: + /* What do we do with commit marks? */ + continue; + case BLOB_CTX: + node_ctx.mark = strtoul(val + 1, NULL, 10); + break; + default: + break; + } + else if (!memcmp(t, "from", 4)) + continue; + else if (!memcmp(t, "data", 4)) { + switch (active_ctx) { + case COMMIT_CTX: + strbuf_reset(&rev_ctx.log); + buffer_read_binary(&input, + &rev_ctx.log, + strtoul(val, NULL, 10)); + populate_props(&rev_ctx.props, + rev_ctx.svn_author.buf, + rev_ctx.log.buf, + rev_ctx.author_date.buf); + dump_export_begin_rev(rev_ctx.rev, + rev_ctx.props.buf, + rev_ctx.props.len); + break; + case BLOB_CTX: + node_ctx.text_len = strtoul(val, NULL, 10); + buffer_read_binary(&input, + &blobs[node_ctx.mark], + node_ctx.text_len); + break; + default: + break; + } + } + break; + case 5: + if (!memcmp(t, "reset", 5)) + continue; + if (!memcmp(t, "merge", 5)) + continue; + break; + case 6: + if (!memcmp(t, "author", 6)) { + parse_author_line(val, &rev_ctx.author, + &rev_ctx.author_email, + &rev_ctx.author_date); + /* Build svn_author */ + t = strchr(rev_ctx.author_email.buf, '@'); + strbuf_reset(&rev_ctx.svn_author); + strbuf_add(&rev_ctx.svn_author, + rev_ctx.author_email.buf, + t - rev_ctx.author_email.buf); + + } + else if (!memcmp(t, "commit", 6)) { + rev_ctx.rev ++; + active_ctx = COMMIT_CTX; + } + break; + case 9: + 
if (!memcmp(t, "committer", 9)) + parse_author_line(val, &rev_ctx.committer, + &rev_ctx.committer_email, + &rev_ctx.committer_date); + break; + default: + break; + } + } +} + +int svnload_init(const char *filename) +{ + int i; + if (buffer_init(&input, filename)) + return error("cannot open %s: %s", filename, strerror(errno)); + active_ctx = UNKNOWN_CTX; + strbuf_init(&rev_ctx.props, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.log, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.author, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.committer, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.author_date, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.committer_date, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.author_email, MAX_GITSVN_LINE_LEN); + strbuf_init(&rev_ctx.committer_email, MAX_GITSVN_LINE_LEN); + strbuf_init(&node_ctx.path, MAX_GITSVN_LINE_LEN); + strbuf_init(&node_ctx.copyfrom_path, MAX_GITSVN_LINE_LEN); + for (i = 0; i < 100; i ++) + strbuf_init(&blobs[i], 10000); + return 0; +} + +void svnload_deinit(void) +{ + int i; + reset_rev_ctx(0); + reset_node_ctx(); + strbuf_release(&rev_ctx.props); + strbuf_release(&rev_ctx.log); + strbuf_release(&rev_ctx.author); + strbuf_release(&rev_ctx.committer); + strbuf_release(&rev_ctx.author_date); + strbuf_release(&rev_ctx.committer_date); + strbuf_release(&rev_ctx.author_email); + strbuf_release(&rev_ctx.committer_email); + strbuf_release(&node_ctx.path); + strbuf_release(&node_ctx.copyfrom_path); + for (i = 0; i < 100; i ++) + strbuf_release(&blobs[i]); + if (buffer_deinit(&input)) + fprintf(stderr, "Input error\n"); + if (ferror(stdout)) + fprintf(stderr, "Output error\n"); +} -- 1.7.4.rc1.7.g2cf08.dirty ^ permalink raw reply related [flat|nested] 8+ messages in thread
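The commit message above notes that the parser is still incorrect. One area a later revision might tighten is the revision property block: as far as Subversion's dump format notes describe it, each key and value is preceded by its byte length ("K <len>" and "V <len>"), and the block ends with "PROPS-END". The following is a minimal sketch under that assumption. The helpers `append_prop`, `format_svn_date` and `build_revprops` are hypothetical names that use plain C buffers instead of strbuf, and the value length is taken with strlen() only for brevity.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define SVN_DATE_FORMAT "%Y-%m-%dT%H:%M:%S.000000Z"
#define SVN_DATE_BUFSZ 28	/* 27 characters plus the terminating NUL */

/*
 * Hypothetical helper: write one "K <len>\n<key>\nV <len>\n<value>\n" record.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t append_prop(char *out, size_t out_len,
			  const char *key, const char *val)
{
	int n = snprintf(out, out_len, "K %zu\n%s\nV %zu\n%s\n",
			 strlen(key), key, strlen(val), val);

	return (n < 0 || (size_t)n >= out_len) ? 0 : (size_t)n;
}

/* Hypothetical helper: format a UTC timestamp in the svn:date layout. */
static size_t format_svn_date(time_t when, char *out, size_t out_len)
{
	const struct tm *tm_utc = gmtime(&when);

	/* strftime() returns the length without the NUL, or 0 on overflow. */
	return tm_utc ? strftime(out, out_len, SVN_DATE_FORMAT, tm_utc) : 0;
}

/* Build a full revision property block; returns its length, 0 on error. */
static size_t build_revprops(char *out, size_t out_len, const char *author,
			     const char *log, time_t when)
{
	static const char end[] = "PROPS-END\n";
	const char *keys[3] = { "svn:author", "svn:log", "svn:date" };
	const char *vals[3];
	char date[SVN_DATE_BUFSZ];
	size_t used = 0, n;
	int i;

	if (!format_svn_date(when, date, sizeof(date)))
		return 0;
	vals[0] = author;
	vals[1] = log;
	vals[2] = date;
	for (i = 0; i < 3; i++) {
		n = append_prop(out + used, out_len - used, keys[i], vals[i]);
		if (!n)
			return 0;
		used += n;
	}
	if (out_len - used < sizeof(end))
		return 0;
	memcpy(out + used, end, sizeof(end));	/* copies the NUL as well */
	return used + sizeof(end) - 1;
}
```

Using the return value of strftime() as the date length avoids counting the terminating NUL as part of the value; the formatted date is 27 characters long, while the 28-byte buffer also holds the NUL.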
* Re: [PATCH 2/3] vcs-svn: Start working on the dumpfile producer
From: Peter Baumann @ 2011-01-15 7:39 UTC
To: Ramkumar Ramachandra
Cc: Git List, Jonathan Nieder, David Barr, Sverre Rabbelier

On Sat, Jan 15, 2011 at 12:21:11PM +0530, Ramkumar Ramachandra wrote:
> Start off with some broad design sketches. Compile succeeds, but
> parser is incorrect. Include a Makefile rule to build it into
> vcs-svn/lib.a.
>
> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
> ---
> [...]
> +void svnload_read(void) {
> +	char *t, *val;
> +	int mode_incr;
> +	struct strbuf *to_dump;
> +
> +	while ((t = buffer_read_line(&input))) {
> +		val = strchr(t, ' ');
> +		if (!val) {
> +			if (!memcmp(t, "blob", 4))
> +				active_ctx = BLOB_CTX;
> +			else if (!memcmp(t, "deleteall", 9))
> +				;
> +			continue;

Having actually no idea what the input you are reading from might look like, but
seeing those two memcmp compares above makes me wonder if 't' might ever be smaller
than 4 (or 9 for the else part). Which obviously would lead to a SEGFAULT.
In the code below there are also memcmp calls which might step out of the
buffer.

> +		}
> +		*val++ = '\0';
> [...]

* Re: [PATCH 2/3] vcs-svn: Start working on the dumpfile producer
From: Ramkumar Ramachandra @ 2011-01-15 8:11 UTC
To: Peter Baumann; +Cc: Git List, Jonathan Nieder, David Barr, Sverre Rabbelier

Hi Peter,

Peter Baumann writes:
> > +	while ((t = buffer_read_line(&input))) {
> > +		val = strchr(t, ' ');
> > +		if (!val) {
> > +			if (!memcmp(t, "blob", 4))
> > +				active_ctx = BLOB_CTX;
> > +			else if (!memcmp(t, "deleteall", 9))
> > +				;
> > +			continue;
>
> Having actually no idea what the input you are reading from might look like, but
> seeing those two memcmp compares above makes me wonder if 't' might ever be smaller
> than 4 (or 9 for the else part). Which obviously would lead to a SEGFAULT.
> In the code below there are also memcmp calls which might step out of the
> buffer.

Right. Silly mistake on my part. Thanks for pointing it out. There are
probably many more trivial mistakes; I was in a hurry to get
/something/ working, and didn't have a chance to clean up the code.

-- Ram

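The bounds concern raised in this exchange can be illustrated with a short sketch. It is not taken from any later revision of the patch; `is_bare_command` and `match_keyword` are hypothetical helpers. The idea is to compare with strcmp() and strncmp(), which stop at the line's NUL terminator, instead of memcmp() with a fixed length that may exceed the length of the line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if the whole line is exactly the given command,
 * e.g. "blob" or "deleteall". strcmp() never reads past the terminator.
 */
static int is_bare_command(const char *line, const char *command)
{
	return !strcmp(line, command);
}

/*
 * Hypothetical helper: if the line starts with "<keyword> ", return a
 * pointer to the text after the space; otherwise return NULL.
 * strncmp() stops at the NUL, so a short line such as "D" is safe to test
 * against a longer keyword such as "committer".
 */
static const char *match_keyword(const char *line, const char *keyword)
{
	size_t len = strlen(keyword);

	/* On a match, line[len] is within the string (it may be the NUL). */
	if (strncmp(line, keyword, len) || line[len] != ' ')
		return NULL;
	return line + len + 1;
}
```

With helpers like these, the dispatch on key length could be replaced by a chain of `match_keyword()` calls, at the cost of a few extra comparisons per line.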
* [PATCH 3/3] Build an svn-fi target in contrib/svn-fe
From: Ramkumar Ramachandra @ 2011-01-15 6:51 UTC
To: Git List; +Cc: Jonathan Nieder, David Barr, Sverre Rabbelier

Build an svn-fi target for testing the dumpfile producer in vcs-svn/.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
---
 contrib/svn-fe/Makefile   |   23 +++++++++++++++++++++--
 contrib/svn-fe/svn-fi.c   |   16 ++++++++++++++++
 contrib/svn-fe/svn-fi.txt |   28 ++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 2 deletions(-)
 create mode 100644 contrib/svn-fe/svn-fi.c
 create mode 100644 contrib/svn-fe/svn-fi.txt

diff --git a/contrib/svn-fe/Makefile b/contrib/svn-fe/Makefile
index 360d8da..555a8ff 100644
--- a/contrib/svn-fe/Makefile
+++ b/contrib/svn-fe/Makefile
@@ -37,7 +37,7 @@ svn-fe$X: svn-fe.o $(VCSSVN_LIB) $(GIT_LIB)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ svn-fe.o \
 		$(ALL_LDFLAGS) $(LIBS)
 
-svn-fe.o: svn-fe.c ../../vcs-svn/svndump.h
+svn-fe.o: svn-fe.c ../../vcs-svn/svnload.h
 	$(QUIET_CC)$(CC) -I../../vcs-svn -o $*.o -c $(ALL_CFLAGS) $<
 
 svn-fe.html: svn-fe.txt
@@ -51,6 +51,24 @@ svn-fe.1: svn-fe.txt
 		../contrib/svn-fe/$@
 	$(MV) ../../Documentation/svn-fe.1 .
 
+svn-fi$X: svn-fi.o $(VCSSVN_LIB) $(GIT_LIB)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ svn-fi.o \
+		$(ALL_LDFLAGS) $(LIBS)
+
+svn-fi.o: svn-fi.c ../../vcs-svn/svnload.h
+	$(QUIET_CC)$(CC) -I../../vcs-svn -o $*.o -c $(ALL_CFLAGS) $<
+
+svn-fi.html: svn-fi.txt
+	$(QUIET_SUBDIR0)../../Documentation $(QUIET_SUBDIR1) \
+		MAN_TXT=../contrib/svn-fe/svn-fi.txt \
+		../contrib/svn-fe/$@
+
+svn-fi.1: svn-fi.txt
+	$(QUIET_SUBDIR0)../../Documentation $(QUIET_SUBDIR1) \
+		MAN_TXT=../contrib/svn-fe/svn-fi.txt \
+		../contrib/svn-fe/$@
+	$(MV) ../../Documentation/svn-fi.1 .
+
 ../../vcs-svn/lib.a: FORCE
 	$(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) vcs-svn/lib.a
 
@@ -58,6 +76,7 @@ svn-fe.1: svn-fe.txt
 	$(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) libgit.a
 
 clean:
-	$(RM) svn-fe$X svn-fe.o svn-fe.html svn-fe.xml svn-fe.1
+	$(RM) svn-fe$X svn-fe.o svn-fe.html svn-fe.xml svn-fe.1 \
+		svn-fi$X svn-fi.o svn-fi.html svn-fi.xml svn-fi.1
 
 .PHONY: all clean FORCE
diff --git a/contrib/svn-fe/svn-fi.c b/contrib/svn-fe/svn-fi.c
new file mode 100644
index 0000000..81347b0
--- /dev/null
+++ b/contrib/svn-fe/svn-fi.c
@@ -0,0 +1,16 @@
+/*
+ * This file is in the public domain.
+ * You may freely use, modify, distribute, and relicense it.
+ */
+
+#include <stdlib.h>
+#include "svnload.h"
+
+int main(int argc, char **argv)
+{
+	if (svnload_init(NULL))
+		return 1;
+	svnload_read();
+	svnload_deinit();
+	return 0;
+}
diff --git a/contrib/svn-fe/svn-fi.txt b/contrib/svn-fe/svn-fi.txt
new file mode 100644
index 0000000..996a175
--- /dev/null
+++ b/contrib/svn-fe/svn-fi.txt
@@ -0,0 +1,28 @@
+svn-fe(1)
+=========
+
+NAME
+----
+svn-fi - convert fast-import stream to an SVN "dumpfile"
+
+SYNOPSIS
+--------
+[verse]
+svn-fi
+
+DESCRIPTION
+-----------
+
+Converts a git-fast-import(1) stream into a Subversion dumpfile.
+
+INPUT FORMAT
+-------------
+The fast-import format is documented by the git-fast-import(1)
+manual page.
+
+OUTPUT FORMAT
+------------
+Subversion's repository dump format is documented in full in
+`notes/dump-load-format.txt` from the Subversion source tree.
+Files in this format can be generated using the 'svnadmin dump' or
+'svk admin dump' command.
-- 
1.7.4.rc1.7.g2cf08.dirty

* Re: [RFC PATCH 0/3] Towards a Git-to-SVN bridge
From: Jonathan Nieder @ 2011-01-15 7:22 UTC
To: Ramkumar Ramachandra; +Cc: Git List, David Barr, Sverre Rabbelier

Hi Ram,

Ramkumar Ramachandra wrote:

> Over the last couple of days, I've been working on a parser that
> converts a fast-import stream into a SVN dumpfile. So far, it's very
> rough and works minimally for some common fast-import
> commands.

Some early questions:

 - what are the design goals? Is this meant to be super fast?
   Robust? Simple? Why should I be excited about it?[1]

 - what subset of fast-import commands is supported? Is it well
   enough defined to make a manpage?

 - does this produce v2 or v3 dumpfiles?

 - why would I use this instead of git2svn? Does git2svn do anything
   this will not eventually be able to do? (Not a trick question ---
   I don't have enough experience with git2svn to tell its strengths
   and weaknesses.)

> I've decided to try re-implementing fast-export
> to eliminate blob marks

Hopefully "re-implement" means "patch" here. :)

I can comment on the code but it's probably better if I have a sense
of the design first (in any event, thanks for sending it).

Regards,
Jonathan

[1] I found the original svn-fe design interesting because (1) it
reused code from an existing svndump parser, at least in spirit, (2)
the repo_tree data structure was well fitted to the design
constraints, (3) the line_buffer input abstraction was oddly
satisfying, even though it does not buy anything obvious out of the
box over direct use of strbuf and stdio; (4) speed; and, most
importantly (5) the command-line interface was easy to debug, very
flexible, and dead simple.

I find the current svn-fe satisfying in a different way --- a sort
of "line by line" translation between dump formats is becoming
possible.

* Re: [RFC PATCH 0/3] Towards a Git-to-SVN bridge
From: Ramkumar Ramachandra @ 2011-01-15 7:43 UTC
To: Jonathan Nieder; +Cc: Git List, David Barr, Sverre Rabbelier

Hi Jonathan,

Jonathan Nieder writes:
> Ramkumar Ramachandra wrote:
>
> > Over the last couple of days, I've been working on a parser that
> > converts a fast-import stream into a SVN dumpfile. So far, it's very
> > rough and works minimally for some common fast-import
> > commands.
>
> Some early questions:

Thanks for raising these questions. People interested in the project
should find this useful.

> - what are the design goals? Is this meant to be super fast?
> Robust? Simple? Why should I be excited about it?[1]

I want it to be a lot like current svn-fe: as you can see, I've
re-used many parsing ideas from it. It has to be at least as fast as
svnrdump, because I don't want it to bottleneck in the remote helper
pipeline. It has to be simple because it'll give rise to other simple
remote helpers: all the complexity has to be offloaded onto the lower
layers like fast-import/fast-export, and not onto the developer of
the remote helper.

> - what subset of fast-import commands is supported? Is it well
> enough defined to make a manpage?

Currently, it supports just "commit", "blob", "author", "committer"
and "mark" that appear after a blob. It should support more commands
soon enough; this implementation is just a proof of concept. Also,
instead of giving it the ability to parse /any/ valid fast-import
stream, I want to simply focus on parsing the stream produced by git
fast-export. That should explain why I'm trying to patch git
fast-export primarily.

> - does this produce v2 or v3 dumpfiles?

This is one issue I haven't thought about fully yet. I'm currently
thinking of generating a non-deltified dumpfile v3 -- something that
svnrdump will accept. Generating deltas might be an unnecessary
overhead, but as you pointed out yesterday, that clearly needs more
thought.

> - why would I use this instead of git2svn? Does git2svn do anything
> this will not eventually be able to do? (Not a trick question ---
> I don't have enough experience with git2svn to tell its strengths
> and weaknesses.)

git2svn persists blobs in-memory. It's written in Perl and it's
slow. I thought we needed something nicer to be used with a remote
helper, and started writing svn-fi.

> > I've decided to try re-implementing fast-export
> > to eliminate blob marks
>
> Hopefully "re-implement" means "patch" here. :)

Yep. Just a big one :)

> I can comment on the code but it's probably better if I have a sense
> of the design first (in any event, thanks for sending it).

I haven't had time to clean up the code. Note that it "just works" at
the moment; yes, it's already very very fast :) However, I'm going to
stall the branch and work on fast-export-inline for now.

-- Ram
