From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jiri Kosina <jkosina@suse.cz>
Subject: Re: [PATCH] shortlog: respect commit encoding
Date: Wed, 25 Nov 2009 15:00:52 +0100
Message-ID: <20091125140052.GA5565@pengutronix.de>
In-Reply-To: <7vfx8376hd.fsf@alter.siamese.dyndns.org>

Hello Junio,

On Tue, Nov 24, 2009 at 05:12:14PM -0800, Junio C Hamano wrote:
> Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:
>
> > Before this change the author was taken from the raw commit without
> > reencoding.
>
> I see people often begin with "before this change" and stop the log
> message after making a statement of a fact. I mildly dislike this style,
> especially when the resulting message does not state that it is bad (and
> if necessary why it is bad) nor state in what way the code after the
> change is good.
>
> Don't take the author name information without re-encoding
> from the raw commit object buffer.
>
> is easier to read, at least for me.
Yes, that's better. Thanks.
> > while (*buffer && *buffer != '\n') {
> > const char *eol = strchr(buffer, '\n');
> >
> > - if (eol == NULL)
> > + if (eol == NULL) {
> > eol = buffer + strlen(buffer);
> > - else
> > + } else
> > eol++;
> > if (!prefixcmp(buffer, "author "))
>
> What is this hunk for?
This is just a leftover from debugging. Removed.
> > @@ -157,20 +162,20 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> > die("Missing author: %s",
> > sha1_to_hex(commit->object.sha1));
> > if (log->user_format) {
> > - struct strbuf buf = STRBUF_INIT;
> > struct pretty_print_context ctx = {0};
> > ctx.abbrev = DEFAULT_ABBREV;
> > ctx.subject = "";
> > ctx.after_subject = "";
> > ctx.date_mode = DATE_NORMAL;
> > + pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
> > + buffer = ufbuf.buf;
> > +
> > + } else if (*buffer)
> > buffer++;
> > +
>
> You probably wanted to add an extra pair of {} around this "else
> if" clause instead, not the earlier one.
Instead, I removed the added blank line (the last changed line you
quoted). Is that OK?
> > diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
> > index 405b971..118204b 100755
> > --- a/t/t4201-shortlog.sh
> > +++ b/t/t4201-shortlog.sh
> > @@ -51,5 +51,29 @@ git log HEAD > log
> > GIT_DIR=non-existing git shortlog -w < log > out
> >
> > test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
> > +iconvfromutf8toiso885915() {
> > + printf "%s" "$@" | iconv -f UTF-8 -t ISO-8859-15
> > +}
>
> A bad use of "$@" that expands to $# individual words; you meant
> to say "$*".
OK.
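
For reference, the difference between the two expansions can be sketched as follows. The helper names are hypothetical and exist only for this illustration. With a single argument, as in the test script, both forms print the same thing; the difference only shows once several arguments are passed.

```shell
#!/bin/sh
# Hypothetical helpers, named only for this illustration.

# "$@" expands to one word per argument. With a single "%s" in the
# format, printf reuses the format for every word, so the arguments
# are concatenated with nothing between them.
join_with_at() {
	printf "%s" "$@"
}

# "$*" expands to a single word: all arguments joined by the first
# character of IFS (a space by default).
join_with_star() {
	printf "%s" "$*"
}

join_with_at "set a1" "to 3"; echo     # prints: set a1to 3
join_with_star "set a1" "to 3"; echo   # prints: set a1 to 3
```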
> Could we please have the following inside its own test, so that
> any failure while preparing the test data is caught as an error?
I moved it into the test itself. Wouldn't it be ugly to have a
separate test that reports something like

	* ok 3: prepare shortlog encoding test

? Or is it better to be able to see where a failure occurs?
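
The point about catching failures during preparation can be sketched without git's test library. The run_checked function below is a hypothetical stand-in for test_expect_success, assumed only to evaluate a script and report its exit status. It shows why the steps are chained with && inside the test body.

```shell
#!/bin/sh
# run_checked is a hypothetical stand-in for test_expect_success; it
# is not part of git's test library. It evaluates the script in a
# subshell and reports the exit status of the whole script.
run_checked() {
	if (eval "$2") >/dev/null 2>&1
	then
		echo "ok - $1"
	else
		echo "not ok - $1"
	fi
}

# Steps separated by ";": the failing first step is forgotten, because
# only the status of the last command counts.
run_checked 'steps separated by ;' 'false; test 1 = 1'

# Steps chained with "&&": the first failing step stops the chain and
# the whole test is reported as failed.
run_checked 'steps chained with &&' 'false && test 1 = 1'
```

The first call reports "ok" even though a step failed; the second reports "not ok". Commands placed outside any test body behave like the first case, which is why the preparation was moved into the test.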
> > +git reset --hard "$commit"
> > +git config --unset i18n.commitencoding
> > +echo 2 > a1
> > +git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>" a1
> > +
> > +git config i18n.commitencoding "ISO-8859-15"
> > +echo 3 > a1
> > +git commit --quiet -m "$(iconvfromutf8toiso885915 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso885915 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>")" a1
> > +git config --unset i18n.commitencoding
> > +
> > +git shortlog HEAD~2.. > out
> > +
> > +cat > expect << EOF
> > +Jöhännës "Dschö" Schindëlin (2):
> > + set a1 to 2 and some non-ASCII chars: Äßø
> > + set a1 to 3 and some non-ASCII chars: áæï
> > +
> > +EOF
> > +
> > +test_expect_success 'shortlog encoding' 'test_cmp expect out'
>
> t3900-i18n-commit already uses 8859-1 so if it is not too much to
> ask, it would be much nicer to have these test work between UTF-8
> and 8859-1, not -15.
>
> That way, I do not have to worry about breaking tests for people
> who were able to run existing iconv tests because they do not have
> working 8859-15.
OK.
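
As a rough illustration of why the encoding matters here: the same author name is stored as different bytes depending on the commit encoding, so the two commits in the test only end up under one shortlog entry if the name is re-encoded first. The sketch below uses hypothetical helper names; only the iconv invocation is taken from the test script.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; only the iconv invocation is
# taken from the test script.
to_latin1() {
	printf "%s" "$*" | iconv -f UTF-8 -t ISO-8859-1
}

# Count bytes on stdin; tr strips the padding some wc versions add.
byte_count() {
	wc -c | tr -d ' '
}

# The UTF-8 encoding of "ö", written as octal escapes so that the
# example does not depend on the encoding of this file.
o_umlaut=$(printf '\303\266')

printf "%s" "$o_umlaut" | byte_count   # 2 bytes in UTF-8
to_latin1 "$o_umlaut" | byte_count     # 1 byte in ISO-8859-1
```

Converting the ISO-8859-1 byte back with "iconv -f ISO-8859-1 -t UTF-8" yields the original two bytes again.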
Below is the updated patch.
Best regards
Uwe
------------------>8----------------------
From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Subject: [PATCH] shortlog: respect commit encoding

Don't take the author name information without re-encoding from the raw
commit object buffer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
---
builtin-shortlog.c | 20 ++++++++++++--------
t/t4201-shortlog.sh | 23 +++++++++++++++++++++++
2 files changed, 35 insertions(+), 8 deletions(-)
diff --git a/builtin-shortlog.c b/builtin-shortlog.c
index 8aa63c7..263adc1 100644
--- a/builtin-shortlog.c
+++ b/builtin-shortlog.c
@@ -139,8 +139,13 @@ static void read_from_stdin(struct shortlog *log)
void shortlog_add_commit(struct shortlog *log, struct commit *commit)
{
const char *author = NULL, *buffer;
+ struct strbuf buf = STRBUF_INIT;
+ struct strbuf ufbuf = STRBUF_INIT;
+ struct pretty_print_context ctx = {0};
- buffer = commit->buffer;
+ pretty_print_commit(CMIT_FMT_RAW, commit, &buf, &ctx);
+
+ buffer = buf.buf;
while (*buffer && *buffer != '\n') {
const char *eol = strchr(buffer, '\n');
@@ -157,20 +162,19 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
die("Missing author: %s",
sha1_to_hex(commit->object.sha1));
if (log->user_format) {
- struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = DEFAULT_ABBREV;
ctx.subject = "";
ctx.after_subject = "";
ctx.date_mode = DATE_NORMAL;
- pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &buf, &ctx);
- insert_one_record(log, author, buf.buf);
- strbuf_release(&buf);
- return;
- }
- if (*buffer)
+ pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
+ buffer = ufbuf.buf;
+
+ } else if (*buffer)
buffer++;
insert_one_record(log, author, !*buffer ? "<none>" : buffer);
+ strbuf_release(&ufbuf);
+ strbuf_release(&buf);
}
static void get_from_rev(struct rev_info *rev, struct shortlog *log)
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index 405b971..03b6950 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -52,4 +52,27 @@ GIT_DIR=non-existing git shortlog -w < log > out
test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
+iconvfromutf8toiso88591() {
+ printf "%s" "$*" | iconv -f UTF-8 -t ISO-8859-1
+}
+
+cat > expect << EOF
+Jöhännës "Dschö" Schindëlin (2):
+ set a1 to 2 and some non-ASCII chars: Äßø
+ set a1 to 3 and some non-ASCII chars: áæï
+
+EOF
+
+test_expect_success 'shortlog encoding' '
+git reset --hard "$commit" &&
+git config --unset i18n.commitencoding &&
+echo 2 > a1 &&
+git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>" a1 &&
+git config i18n.commitencoding "ISO-8859-1" &&
+echo 3 > a1 &&
+git commit --quiet -m "$(iconvfromutf8toiso88591 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso88591 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>")" a1 &&
+git config --unset i18n.commitencoding &&
+git shortlog HEAD~2.. > out &&
+test_cmp expect out'
+
test_done
--
1.6.5.3
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |