From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, johannes.schindelin@gmx.de,
Elijah Newren <newren@gmail.com>
Subject: [PATCH 5/5] fast-export: do automatic reencoding of commit messages only if requested
Date: Thu, 25 Apr 2019 08:51:18 -0700
Message-ID: <20190425155118.7918-6-newren@gmail.com>
In-Reply-To: <20190425155118.7918-1-newren@gmail.com>
Automatic re-encoding of commit messages (and dropping of the encoding
header) hurts attempts to do reversible history rewrites (e.g. sha1sum
<-> sha256sum transitions, some subtree rewrites), and seems
inconsistent with the general principle followed elsewhere in
fast-export of requiring explicit user requests to modify the output
(e.g. --signed-tags=strip, --tag-of-filtered-object=rewrite). Add a
--reencode flag that lets the user choose how such messages are handled
(yes, no, or abort) and, like other fast-export flags, default it to
'abort'.
Signed-off-by: Elijah Newren <newren@gmail.com>
---
builtin/fast-export.c | 35 ++++++++++++++++++++++++++++++++---
t/t9350-fast-export.sh | 31 ++++++++++++++++++++++++++++---
2 files changed, 60 insertions(+), 6 deletions(-)
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 66331fa401..43cc52331c 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -33,6 +33,7 @@ static const char *fast_export_usage[] = {
static int progress;
static enum { SIGNED_TAG_ABORT, VERBATIM, WARN, WARN_STRIP, STRIP } signed_tag_mode = SIGNED_TAG_ABORT;
static enum { TAG_FILTERING_ABORT, DROP, REWRITE } tag_of_filtered_mode = TAG_FILTERING_ABORT;
+static enum { REENCODE_ABORT, REENCODE_PLEASE, REENCODE_NEVER } reencode_mode = REENCODE_ABORT;
static int fake_missing_tagger;
static int use_done_feature;
static int no_data;
@@ -77,6 +78,20 @@ static int parse_opt_tag_of_filtered_mode(const struct option *opt,
return 0;
}
+static int parse_opt_reencode_mode(const struct option *opt,
+				   const char *arg, int unset)
+{
+	if (unset || !strcmp(arg, "abort"))
+		reencode_mode = REENCODE_ABORT;
+	else if (!strcmp(arg, "yes"))
+		reencode_mode = REENCODE_PLEASE;
+	else if (!strcmp(arg, "no"))
+		reencode_mode = REENCODE_NEVER;
+	else
+		return error("Unknown reencoding mode: %s", arg);
+	return 0;
+}
+
static struct decoration idnums;
static uint32_t last_idnum;
@@ -633,10 +648,21 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
}
mark_next_object(&commit->object);
-	if (anonymize)
+	if (anonymize) {
 		reencoded = anonymize_commit_message(message);
-	else if (!is_encoding_utf8(encoding))
-		reencoded = reencode_string(message, "UTF-8", encoding);
+	} else if (encoding) {
+		switch(reencode_mode) {
+		case REENCODE_PLEASE:
+			reencoded = reencode_string(message, "UTF-8", encoding);
+			break;
+		case REENCODE_NEVER:
+			break;
+		case REENCODE_ABORT:
+			die("Encountered commit-specific encoding %s in commit "
+			    "%s; use --reencode=<mode> to handle it",
+			    encoding, oid_to_hex(&commit->object.oid));
+		}
+	}
if (!commit->parents)
printf("reset %s\n", refname);
printf("commit %s\nmark :%"PRIu32"\n", refname, last_idnum);
@@ -1091,6 +1117,9 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
OPT_CALLBACK(0, "tag-of-filtered-object", &tag_of_filtered_mode, N_("mode"),
N_("select handling of tags that tag filtered objects"),
parse_opt_tag_of_filtered_mode),
+ OPT_CALLBACK(0, "reencode", &reencode_mode, N_("mode"),
+ N_("select handling of commit messages in an alternate encoding"),
+ parse_opt_reencode_mode),
OPT_STRING(0, "export-marks", &export_filename, N_("file"),
N_("Dump marks to this file")),
OPT_STRING(0, "import-marks", &import_filename, N_("file"),
diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
index 975c8c4014..4774926bb6 100755
--- a/t/t9350-fast-export.sh
+++ b/t/t9350-fast-export.sh
@@ -94,7 +94,7 @@ test_expect_success 'fast-export --show-original-ids | git fast-import' '
test $MUSS = $(git rev-parse --verify refs/tags/muss)
'
-test_expect_success 'iso-8859-7' '
+test_expect_success 'reencoding iso-8859-7' '
test_when_finished "git reset --hard HEAD~1" &&
test_when_finished "git config --unset i18n.commitencoding" &&
@@ -102,7 +102,7 @@ test_expect_success 'iso-8859-7' '
test_tick &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360")" file &&
- git fast-export wer^..wer >iso-8859-7.fi &&
+ git fast-export --reencode=yes wer^..wer >iso-8859-7.fi &&
sed "s/wer/i18n/" iso-8859-7.fi |
(cd new &&
git fast-import &&
@@ -110,6 +110,31 @@ test_expect_success 'iso-8859-7' '
grep $(printf "\317\200") actual)
'
+test_expect_success 'aborting on iso-8859-7' '
+
+	test_when_finished "git reset --hard HEAD~1" &&
+	test_when_finished "git config --unset i18n.commitencoding" &&
+	git config i18n.commitencoding iso-8859-7 &&
+	echo rosten >file &&
+	git commit -s -m "$(printf "Pi: \360")" file &&
+	test_must_fail git fast-export --reencode=abort wer^..wer >iso-8859-7.fi
+'
+
+test_expect_success 'preserving iso-8859-7' '
+
+	test_when_finished "git reset --hard HEAD~1" &&
+	test_when_finished "git config --unset i18n.commitencoding" &&
+	git config i18n.commitencoding iso-8859-7 &&
+	echo rosten >file &&
+	git commit -s -m "$(printf "Pi: \360")" file &&
+	git fast-export --reencode=no wer^..wer >iso-8859-7.fi &&
+	sed "s/wer/i18n-no-recoding/" iso-8859-7.fi |
+		(cd new &&
+		 git fast-import &&
+		 git cat-file commit i18n-no-recoding >actual &&
+		 grep $(printf "\360") actual)
+'
+
test_expect_success 'encoding preserved if reencoding fails' '
test_when_finished "git reset --hard HEAD~1" &&
@@ -117,7 +142,7 @@ test_expect_success 'encoding preserved if reencoding fails' '
git config i18n.commitencoding iso-8859-7 &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360; Invalid: \377")" file &&
- git fast-export wer^..wer >iso-8859-7.fi &&
+ git fast-export --reencode=yes wer^..wer >iso-8859-7.fi &&
sed "s/wer/i18n-invalid/" iso-8859-7.fi |
(cd new &&
git fast-import &&
--
2.21.0.779.g2f4b9c5032
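
For readers who want to see the three --reencode modes in isolation, here
is a minimal standalone sketch of the decision the patch adds to
handle_commit(). It is not the code from builtin/fast-export.c: the names
parse_reencode_mode(), choose_action(), and the ACT_* values are
hypothetical and exist only for illustration, and the sketch assumes that
a NULL encoding means the commit carries no encoding that needs handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the three modes the patch introduces. */
enum reencode_mode { REENCODE_ABORT, REENCODE_YES, REENCODE_NO };

/* What the exporter would do with one commit message (illustrative only). */
enum action {
	ACT_PASS_THROUGH,   /* no commit-specific encoding: nothing to decide */
	ACT_REENCODE,       /* --reencode=yes: convert the message to UTF-8 */
	ACT_KEEP_ENCODING,  /* --reencode=no: emit the original bytes */
	ACT_DIE             /* --reencode=abort (default): refuse to continue */
};

/*
 * Map the option argument to a mode. A NULL argument selects the
 * default, 'abort'. Returns 0 on success, -1 for an unknown mode.
 */
static int parse_reencode_mode(const char *arg, enum reencode_mode *mode)
{
	if (!arg || !strcmp(arg, "abort"))
		*mode = REENCODE_ABORT;
	else if (!strcmp(arg, "yes"))
		*mode = REENCODE_YES;
	else if (!strcmp(arg, "no"))
		*mode = REENCODE_NO;
	else
		return -1;
	return 0;
}

/* Decide how to treat a commit whose encoding header is 'encoding'. */
static enum action choose_action(const char *encoding, enum reencode_mode mode)
{
	if (!encoding)
		return ACT_PASS_THROUGH;
	switch (mode) {
	case REENCODE_YES:
		return ACT_REENCODE;
	case REENCODE_NO:
		return ACT_KEEP_ENCODING;
	case REENCODE_ABORT:
		break;
	}
	return ACT_DIE;
}
```

With these definitions, a commit recorded as iso-8859-7 maps to ACT_DIE
under the default mode, to ACT_REENCODE under "yes", and to
ACT_KEEP_ENCODING under "no", which matches the three tests in the patch.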