From: Elijah Newren <newren@gmail.com>
To: gitster@pobox.com
Cc: git@vger.kernel.org, Eric Sunshine <sunshine@sunshineco.com>,
Elijah Newren <newren@gmail.com>
Subject: [PATCH v2 0/5] Fix and extend encoding handling in fast export/import
Date: Tue, 30 Apr 2019 11:25:18 -0700
Message-ID: <20190430182523.3339-1-newren@gmail.com>

While stress testing `git filter-repo`, I noticed an issue with
encoding; further digging led to the fixes and features in this series.
See the individual commit messages for details.

Changes since v1 (full range-diff below):
  * Applied style fixes Eric pointed out in his review (thanks!)
  * Rebased on latest master (83232e38, "The seventh batch"), resolving
    a trivial merge conflict. Now merges cleanly with next and pu as
    well.

I'm a bit under the weather so I may be slow to respond...

Elijah Newren (5):
  t9350: fix encoding test to actually test reencoding
  fast-import: support 'encoding' commit header
  fast-export: avoid stripping encoding header if we cannot reencode
  fast-export: differentiate between explicitly utf-8 and implicitly
    utf-8
  fast-export: do automatic reencoding of commit messages only if
    requested

 Documentation/git-fast-import.txt |  7 ++++
 builtin/fast-export.c             | 44 +++++++++++++++++++++----
 fast-import.c                     | 11 +++++--
 t/t9300-fast-import.sh            | 20 ++++++++++++
 t/t9350-fast-export.sh            | 53 +++++++++++++++++++++++++------
 5 files changed, 118 insertions(+), 17 deletions(-)

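For readers who want to see what patch 2/5 accepts on the input side, here is a minimal sketch of a fast-import stream carrying an 'encoding' header. The function name `make_stream`, the committer identity, and the timestamp are made up for illustration; only the placement of the `encoding` line (after `committer`, before `data`) follows the parsing order visible in the range-diff below.

```shell
# Emit a tiny fast-import stream whose commit message is ISO-8859-7 encoded.
# make_stream and the committer identity are hypothetical, for illustration only.
make_stream () {
	printf 'commit refs/heads/encoding\n'
	printf 'committer A U Thor <author@example.com> 1556648718 -0700\n'
	# The encoding header comes after the committer line and before 'data'.
	printf 'encoding iso-8859-7\n'
	# 5 bytes follow: "Pi: " (4 bytes) plus \360, which is pi in ISO-8859-7.
	printf 'data 5\n'
	printf 'Pi: \360\n'
}

# Show the header and the data command as they appear in the stream.
make_stream | LC_ALL=C sed -n '3,4p'
```

With a git that includes this series, the stream could be fed to `git fast-import` from inside a repository (`make_stream | git fast-import`), and the resulting commit should record the message bytes unchanged along with the encoding header.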
Range-diff:
1: d6efd05142 ! 1: 9cc04242bd t9350: fix encoding test to actually test reencoding
@@ -26,8 +26,7 @@
- # use author and committer name in ISO-8859-1 to match it.
- . "$TEST_DIRECTORY"/t3901/8859-1.txt &&
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
test_tick &&
echo rosten >file &&
- git commit -s -m den file &&
2: 02f48c7559 ! 2: 0cd023ac7a fast-import: support 'encoding' commit header
@@ -51,9 +51,8 @@
}
if (!committer)
die("Expected committer but didn't get one");
-+ if (skip_prefix(command_buf.buf, "encoding ", &encoding)) {
++ if (skip_prefix(command_buf.buf, "encoding ", &encoding))
+ read_next_command();
-+ }
parse_data(&msg, 0, NULL);
read_next_command();
parse_from(b);
@@ -69,7 +68,7 @@
+ strbuf_addf(&new_data,
+ "encoding %s\n",
+ encoding);
-+ strbuf_addf(&new_data, "\n");
++ strbuf_addch(&new_data, '\n');
strbuf_addbuf(&new_data, &msg);
free(author);
free(committer);
@@ -78,14 +77,14 @@
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@
- background_import_still_running
+ sed -e s/LFs/LLL/ W-input | tr L "\n" | test_must_fail git fast-import
'
+###
-+### series W (other new features)
++### series X (other new features)
+###
+
-+test_expect_success 'W: handling encoding' '
++test_expect_success 'X: handling encoding' '
+ test_tick &&
+ cat >input <<-INPUT_END &&
+ commit refs/heads/encoding
3: 86c348402d ! 3: 1fddf51402 fast-export: avoid stripping encoding header if we cannot reencode
@@ -41,8 +41,7 @@
+test_expect_success 'encoding preserved if reencoding fails' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360; Invalid: \377")" file &&
+ git fast-export wer^..wer >iso-8859-7.fi &&
4: c09b23bc59 = 4: 4a2e04b3ae fast-export: differentiate between explicitly utf-8 and implicitly utf-8
5: 24b69a0db9 ! 5: 44aacb1a0b fast-export: do automatic reencoding of commit messages only if requested
@@ -92,8 +92,7 @@
+test_expect_success 'reencoding iso-8859-7' '
test_when_finished "git reset --hard HEAD~1" &&
- test_when_finished "git config --unset i18n.commitencoding" &&
-@@
+ test_config i18n.commitencoding iso-8859-7 &&
test_tick &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360")" file &&
@@ -109,8 +108,7 @@
+test_expect_success 'aborting on iso-8859-7' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360")" file &&
+ test_must_fail git fast-export --reencode=abort wer^..wer >iso-8859-7.fi
@@ -119,8 +117,7 @@
+test_expect_success 'preserving iso-8859-7' '
+
+ test_when_finished "git reset --hard HEAD~1" &&
-+ test_when_finished "git config --unset i18n.commitencoding" &&
-+ git config i18n.commitencoding iso-8859-7 &&
++ test_config i18n.commitencoding iso-8859-7 &&
+ echo rosten >file &&
+ git commit -s -m "$(printf "Pi: \360")" file &&
+ git fast-export --reencode=no wer^..wer >iso-8859-7.fi &&
@@ -134,8 +131,7 @@
test_expect_success 'encoding preserved if reencoding fails' '
test_when_finished "git reset --hard HEAD~1" &&
-@@
- git config i18n.commitencoding iso-8859-7 &&
+ test_config i18n.commitencoding iso-8859-7 &&
echo rosten >file &&
git commit -s -m "$(printf "Pi: \360; Invalid: \377")" file &&
- git fast-export wer^..wer >iso-8859-7.fi &&
--
2.21.0.782.g44aacb1a0b
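
The export side added in patch 5/5 can be tried by hand along the lines of the new t9350 tests. The following is only a sketch: it assumes a git build that already contains this series (so that `--reencode` is recognized), and the throwaway repository, identity, and file names are arbitrary choices made for the example.

```shell
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "A U Thor"
git config user.email author@example.com
git config i18n.commitencoding iso-8859-7

echo rosten >file && git add file
# \360 is pi in ISO-8859-7, the same byte the t9350 tests use.
git commit -q -m "$(printf 'Pi: \360')"

# --reencode=no: keep the original message bytes and emit an 'encoding' header.
git fast-export --reencode=no HEAD >preserved.fi
LC_ALL=C grep '^encoding ' preserved.fi

# --reencode=abort: refuse to export a commit that carries an encoding header.
if git fast-export --reencode=abort HEAD >/dev/null 2>&1
then
	echo "unexpected: export succeeded"
else
	echo "export aborted as requested"
fi
```

The first export mirrors the 'preserving iso-8859-7' test and the second mirrors 'aborting on iso-8859-7'.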