From: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>, Patrick Steinhardt <ps@pks.im>,
Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: [PATCH/RFC 4/5] test-tool: add a "historian" subcommand for building merge fixtures
Date: Wed, 06 May 2026 22:43:23 +0000
Message-ID: <72c486312cde9a9fd2dedb60bc43c5c3e40a0d64.1778107405.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.2106.git.1778107405.gitgitgadget@gmail.com>
From: Johannes Schindelin <johannes.schindelin@gmx.de>
The merge-replay tests added in a follow-up commit need a way to set
up specific topologies with full control over blob contents, parent
order, and per-side trees. Sequencing plumbing commands or driving
plain `git fast-import` from shell quickly becomes unreadable for
the kinds of scenarios that exercise non-trivial merge resolution
(textual conflicts, semantic edits outside the conflict region,
intentional limitations such as new content on one side).
Add a small `test-tool historian` subcommand that reads a tight,
shell-quoted, one-line-per-object DSL and feeds an equivalent stream
to a `git fast-import` child process. Each blob and commit is given
a logical name; the helper allocates fast-import marks on first use
and emits a lightweight tag for every commit so tests can refer to
the resulting object via `refs/tags/<name>`.
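
The first-use mark allocation described above can be sketched in
isolation. This is only an illustration under stated assumptions, not
the helper's code: the patch keeps names in a `strintmap`, whereas the
fixed-size `name_table` below is a hypothetical stand-in so the example
compiles without Git's headers.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 64

/* Hypothetical stand-in for the strintmap used by the helper. */
struct name_table {
	const char *names[MAX_NAMES];
	int nr;
};

/*
 * Return the mark for `name`, allocating the next free one on first
 * use. Marks start at 1, so 0 never names an object. The table keeps
 * the pointer, so `name` must outlive the table.
 */
static int resolve_mark(struct name_table *t, const char *name)
{
	int i;

	for (i = 0; i < t->nr; i++)
		if (!strcmp(t->names[i], name))
			return i + 1;

	assert(t->nr < MAX_NAMES);
	t->names[t->nr++] = name;
	return t->nr;
}
```

Resolving `base`, `side` and then `base` again yields marks 1, 2 and 1:
a name seen for the first time gets a fresh mark, and later references
reuse it.
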
The DSL has just two directives:

    blob   NAME LINE...
    commit NAME BRANCH SUBJECT [from=NAME] [merge=NAME]... [PATH=BLOB]...

A blob's content is the listed lines joined with `\n` (and a final
`\n`); a commit's tree is exactly the listed PATH=BLOB pairs (the
helper emits a `deleteall` so nothing leaks in from the implicit
parent). Token splitting is delegated to `split_cmdline()` so quoted
arguments work as in shell. Marks for parent references and file
contents go through the same `strintmap`-backed name resolver, which
keeps the helper itself trivially small: blob writing, tree
construction and commit creation are all handled by
`git fast-import`.
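
To make the blob rule concrete, here is a rough sketch of how a `blob`
directive could map onto the fast-import stream. The function name
`format_blob` and its buffer-based interface are assumptions made for
illustration; the helper in this patch writes straight to the child's
stdin instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration: join the LINEs with '\n', end with a
 * final '\n', and frame the result as a fast-import `blob` command.
 * Returns the number of bytes written to `buf` (excluding the NUL),
 * or -1 if `buf` is too small.
 */
static int format_blob(char *buf, size_t size, int mark,
		       int nlines, const char **lines)
{
	size_t content_len = 0;
	int off, i;

	assert(buf && size > 0);

	for (i = 0; i < nlines; i++)
		content_len += strlen(lines[i]) + 1; /* +1 for the '\n' */

	off = snprintf(buf, size, "blob\nmark :%d\ndata %zu\n",
		       mark, content_len);
	if (off < 0 || (size_t)off >= size)
		return -1;

	for (i = 0; i < nlines; i++) {
		int n = snprintf(buf + off, size - off, "%s\n", lines[i]);
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += n;
	}

	/* Optional LF after the payload, as emit_data() in the patch writes. */
	if ((size_t)off + 1 >= size)
		return -1;
	buf[off++] = '\n';
	buf[off] = '\0';
	return off;
}
```

For the input `blob b1 a b` with mark 1, this produces `blob`,
`mark :1`, `data 4`, then the two content lines `a` and `b`, followed
by one empty line. With no LINEs the payload length is 0, matching the
empty-blob case.
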
Note that the DSL reserves the names `from` and `merge` (with a
trailing `=`) for parent specification; a tree path called `from` or
`merge` cannot be expressed via this helper. That is acceptable here
because every input is a tightly controlled test fixture and the
filenames are chosen by the test author.
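
The reservation follows directly from how trailing tokens are
classified. A minimal standalone sketch, assuming plain `strncmp()` in
place of Git's `skip_prefix()` and using the hypothetical names
`classify_token` and `enum token_kind`:

```c
#include <assert.h>
#include <string.h>

enum token_kind { TOKEN_FROM, TOKEN_MERGE, TOKEN_PATH, TOKEN_BAD };

/*
 * Classify one trailing token of a `commit` directive. On return,
 * `*value` points at the text after the '=' (a parent or blob name),
 * or is NULL when the token has no '=' at all.
 */
static enum token_kind classify_token(const char *tok, const char **value)
{
	const char *eq;

	assert(tok && value);

	/* The parent prefixes are checked first, so they always win. */
	if (!strncmp(tok, "from=", 5)) {
		*value = tok + 5;
		return TOKEN_FROM;
	}
	if (!strncmp(tok, "merge=", 6)) {
		*value = tok + 6;
		return TOKEN_MERGE;
	}

	/* Anything else must be PATH=BLOB, split at the first '='. */
	eq = strchr(tok, '=');
	if (!eq) {
		*value = NULL;
		return TOKEN_BAD;
	}
	*value = eq + 1;
	return TOKEN_PATH;
}
```

Here `from=blob1` is always read as a parent reference, never as a file
named `from`, while `fromage=blob1` is an ordinary path because only
the exact `from=` and `merge=` prefixes are reserved.
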
The helper trusts its caller: malformed input results in a
fast-import error rather than a friendly diagnostic.
Wire the new subcommand into the Makefile and meson build, and
register it in `t/helper/test-tool.{c,h}`.
Assisted-by: Claude Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 Makefile                  |   1 +
 t/helper/meson.build      |   1 +
 t/helper/test-historian.c | 189 ++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c      |   1 +
 t/helper/test-tool.h      |   1 +
 5 files changed, 193 insertions(+)
 create mode 100644 t/helper/test-historian.c
diff --git a/Makefile b/Makefile
index cedc234173..b38678b484 100644
--- a/Makefile
+++ b/Makefile
@@ -832,6 +832,7 @@ TEST_BUILTINS_OBJS += test-hash-speed.o
TEST_BUILTINS_OBJS += test-hash.o
TEST_BUILTINS_OBJS += test-hashmap.o
TEST_BUILTINS_OBJS += test-hexdump.o
+TEST_BUILTINS_OBJS += test-historian.o
TEST_BUILTINS_OBJS += test-json-writer.o
TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o
TEST_BUILTINS_OBJS += test-match-trees.o
diff --git a/t/helper/meson.build b/t/helper/meson.build
index 675e64c010..704edd1e1f 100644
--- a/t/helper/meson.build
+++ b/t/helper/meson.build
@@ -29,6 +29,7 @@ test_tool_sources = [
'test-hash.c',
'test-hashmap.c',
'test-hexdump.c',
+ 'test-historian.c',
'test-json-writer.c',
'test-lazy-init-name-hash.c',
'test-match-trees.c',
diff --git a/t/helper/test-historian.c b/t/helper/test-historian.c
new file mode 100644
index 0000000000..2250d420c0
--- /dev/null
+++ b/t/helper/test-historian.c
@@ -0,0 +1,189 @@
+/*
+ * Build a small history out of a tiny declarative input. Used by tests
+ * that need specific merge topologies without long sequences of
+ * plumbing commands or fragile shell helpers.
+ *
+ * The historian reads stdin line by line and emits an equivalent
+ * stream to a `git fast-import` child process. It also allocates marks
+ * for named objects so tests can refer to commits and blobs by name.
+ *
+ * Input directives (one per line, shell-style quoting):
+ *
+ * blob NAME LINE1 LINE2 ...
+ * Each LINE becomes a content line in the blob; lines are
+ * joined with '\n' and the blob ends with a final '\n'. With
+ * no LINEs, the blob is empty.
+ *
+ * commit NAME BRANCH SUBJECT [from=PARENT] [merge=PARENT]... [PATH=BLOB]...
+ * Creates a commit on refs/heads/BRANCH using the listed
+ * file=blob mappings as the entire tree (no inheritance from
+ * parents). Up to one `from=` and any number of `merge=`
+ * parents may be given. `from=` defaults to the current branch
+ * tip; if BRANCH has no tip yet, the commit becomes a root.
+ *
+ * Each `commit NAME` directive also creates a lightweight tag
+ * `refs/tags/NAME` so tests can `git rev-parse NAME`.
+ *
+ * This helper trusts its caller; malformed input results in fast-import
+ * errors. That is fine because test scripts feed it tightly controlled
+ * input.
+ */
+
+#define USE_THE_REPOSITORY_VARIABLE
+
+#include "test-tool.h"
+#include "git-compat-util.h"
+#include "alias.h"
+#include "run-command.h"
+#include "setup.h"
+#include "strbuf.h"
+#include "strmap.h"
+#include "strvec.h"
+
+static int next_mark = 1;
+
+static int resolve_mark(struct strintmap *names, const char *name)
+{
+ int n = strintmap_get(names, name);
+ if (!n) {
+ n = next_mark++;
+ strintmap_set(names, name, n);
+ }
+ return n;
+}
+
+static void emit_data(FILE *out, const char *data, size_t len)
+{
+ fprintf(out, "data %"PRIuMAX"\n", (uintmax_t)len);
+ fwrite(data, 1, len, out);
+ fputc('\n', out);
+}
+
+static void emit_blob(FILE *out, struct strintmap *names,
+ int argc, const char **argv)
+{
+ struct strbuf content = STRBUF_INIT;
+ int n = resolve_mark(names, argv[1]);
+ int i;
+
+ for (i = 2; i < argc; i++) {
+ strbuf_addstr(&content, argv[i]);
+ strbuf_addch(&content, '\n');
+ }
+
+ fprintf(out, "blob\nmark :%d\n", n);
+ emit_data(out, content.buf, content.len);
+ strbuf_release(&content);
+}
+
+static void emit_tag(FILE *out, const char *name, int mark)
+{
+ fprintf(out, "reset refs/tags/%s\nfrom :%d\n\n", name, mark);
+}
+
+static void emit_commit(FILE *out, struct strintmap *names,
+ int argc, const char **argv, int seq)
+{
+ int n = resolve_mark(names, argv[1]);
+ const char *branch = argv[2];
+ const char *subject = argv[3];
+ const char *rest;
+ int i;
+
+ fprintf(out, "commit refs/heads/%s\nmark :%d\n", branch, n);
+ fprintf(out, "author A <a@e> %d +0000\n", 1700000000 + seq);
+ fprintf(out, "committer A <a@e> %d +0000\n", 1700000000 + seq);
+ emit_data(out, subject, strlen(subject));
+
+ /*
+ * fast-import requires `from` and `merge` to precede all file
+ * operations; emit them first regardless of argv ordering.
+ */
+ for (i = 4; i < argc; i++) {
+ if (skip_prefix(argv[i], "from=", &rest))
+ fprintf(out, "from :%d\n", resolve_mark(names, rest));
+ else if (skip_prefix(argv[i], "merge=", &rest))
+ fprintf(out, "merge :%d\n", resolve_mark(names, rest));
+ }
+
+ /*
+ * The PATH=BLOB list is the entire tree; wipe whatever the
+ * implicit parent contributed before re-applying it.
+ */
+ fprintf(out, "deleteall\n");
+ for (i = 4; i < argc; i++) {
+ const char *eq;
+ size_t key_len;
+ char *path;
+
+ if (skip_prefix(argv[i], "from=", &rest) ||
+ skip_prefix(argv[i], "merge=", &rest))
+ continue;
+ eq = strchr(argv[i], '=');
+ if (!eq)
+ die("bad commit spec '%s'", argv[i]);
+ key_len = eq - argv[i];
+ path = xmemdupz(argv[i], key_len);
+ fprintf(out, "M 100644 :%d %s\n",
+ resolve_mark(names, eq + 1), path);
+ free(path);
+ }
+
+ fputc('\n', out);
+ emit_tag(out, argv[1], n);
+}
+
+int cmd__historian(int argc, const char **argv UNUSED)
+{
+ struct child_process fi = CHILD_PROCESS_INIT;
+ struct strintmap names = STRINTMAP_INIT;
+ struct strbuf line = STRBUF_INIT;
+ int seq = 0;
+ int ret = 0;
+ FILE *fi_in;
+
+ if (argc != 1)
+ die("usage: test-tool historian <input");
+
+ setup_git_directory();
+
+ strvec_pushl(&fi.args, "fast-import", "--quiet", "--force", NULL);
+ fi.git_cmd = 1;
+ fi.in = -1;
+ fi.no_stdout = 1;
+ if (start_command(&fi))
+ die("failed to start git fast-import");
+ fi_in = xfdopen(fi.in, "w");
+
+ while (strbuf_getline_lf(&line, stdin) != EOF) {
+ const char **a = NULL;
+ int n;
+
+ strbuf_trim(&line);
+ if (!line.len || line.buf[0] == '#')
+ continue;
+
+ n = split_cmdline(line.buf, &a);
+ if (n < 0)
+ die("split_cmdline failed: %s",
+ split_cmdline_strerror(n));
+
+ if (n >= 2 && !strcmp(a[0], "blob"))
+ emit_blob(fi_in, &names, n, a);
+ else if (n >= 4 && !strcmp(a[0], "commit"))
+ emit_commit(fi_in, &names, n, a, seq++);
+ else
+ die("unknown directive: %s", a[0]);
+
+ free(a);
+ }
+
+ if (fclose(fi_in))
+ die_errno("close fast-import stdin");
+ if (finish_command(&fi))
+ ret = 1;
+
+ strbuf_release(&line);
+ strintmap_clear(&names);
+ return ret;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index a7abc618b3..28bde98ce1 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -39,6 +39,7 @@ static struct test_cmd cmds[] = {
{ "hashmap", cmd__hashmap },
{ "hash-speed", cmd__hash_speed },
{ "hexdump", cmd__hexdump },
+ { "historian", cmd__historian },
{ "json-writer", cmd__json_writer },
{ "lazy-init-name-hash", cmd__lazy_init_name_hash },
{ "match-trees", cmd__match_trees },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 7f150fa1eb..78cec8594a 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -32,6 +32,7 @@ int cmd__getcwd(int argc, const char **argv);
int cmd__hashmap(int argc, const char **argv);
int cmd__hash_speed(int argc, const char **argv);
int cmd__hexdump(int argc, const char **argv);
+int cmd__historian(int argc, const char **argv);
int cmd__json_writer(int argc, const char **argv);
int cmd__lazy_init_name_hash(int argc, const char **argv);
int cmd__match_trees(int argc, const char **argv);
--
gitgitgadget