git.vger.kernel.org archive mirror
From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jiri Kosina <jkosina@suse.cz>
Subject: Re: [PATCH] shortlog: respect commit encoding
Date: Wed, 25 Nov 2009 15:00:52 +0100
Message-ID: <20091125140052.GA5565@pengutronix.de>
In-Reply-To: <7vfx8376hd.fsf@alter.siamese.dyndns.org>

Hello Junio,

On Tue, Nov 24, 2009 at 05:12:14PM -0800, Junio C Hamano wrote:
> Uwe Kleine-König  <u.kleine-koenig@pengutronix.de> writes:
> 
> > Before this change the author was taken from the raw commit without
> > reencoding.
> 
> I see people often begin with "before this change" and stop the log
> message after making a statement of a fact.  I mildly dislike this style,
> especially when the resulting message does not state that it is bad (and
> if necessary why it is bad) nor state in what way the code after the
> change is good.
> 
> 	Don't take the author name information without re-encoding
>         from the raw commit object buffer.
> 
> is easier to read, at least for me.
Yes, that's better.  Thanks.
 
> >  	while (*buffer && *buffer != '\n') {
> >  		const char *eol = strchr(buffer, '\n');
> >  
> > -		if (eol == NULL)
> > +		if (eol == NULL) {
> >  			eol = buffer + strlen(buffer);
> > -		else
> > +		} else
> >  			eol++;
> >  		if (!prefixcmp(buffer, "author "))
> 
> What is this hunk for?
This is just a leftover from debugging.  Removed.
 
> > @@ -157,20 +162,20 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> >  		die("Missing author: %s",
> >  		    sha1_to_hex(commit->object.sha1));
> >  	if (log->user_format) {
> > -		struct strbuf buf = STRBUF_INIT;
> >  		struct pretty_print_context ctx = {0};
> >  		ctx.abbrev = DEFAULT_ABBREV;
> >  		ctx.subject = "";
> >  		ctx.after_subject = "";
> >  		ctx.date_mode = DATE_NORMAL;
> > +		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
> > +		buffer = ufbuf.buf;
> > +
> > +	} else if (*buffer)
> >  		buffer++;
> > +
> 
> You probably wanted to add an extra pair of {} around this "else
> if" clause instead, not the earlier one.
Instead, I removed the newly added blank line (the last changed line
you quoted).  Good?
 
> > diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
> > index 405b971..118204b 100755
> > --- a/t/t4201-shortlog.sh
> > +++ b/t/t4201-shortlog.sh
> > @@ -51,5 +51,29 @@ git log HEAD > log
> >  GIT_DIR=non-existing git shortlog -w < log > out
> >  
> >  test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
> > +iconvfromutf8toiso885915() {
> > +	printf "%s" "$@" | iconv -f UTF-8 -t ISO-8859-15
> > +}
> 
> A bad use of "$@" that expands to $# individual words; you meant
> to say "$*".
OK.
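
For the record, the difference can be seen with a small sketch.  The two
helper names below are hypothetical and only exist for this illustration;
they are not part of the patch.  With "$@" every argument stays a separate
word and printf reuses the "%s" format for each of them, so nothing
separates the arguments in the output.  With "$*" there is a single word
in which the arguments are joined by the first character of IFS (a space
by default).

```shell
# Hypothetical helpers, named only for this illustration.
join_at()   { printf "%s" "$@"; }   # one word per argument, "%s" is reused
join_star() { printf "%s" "$*"; }   # a single word, joined by the first IFS char

join_at   "set a1" "to 3"; echo     # prints "set a1to 3"
join_star "set a1" "to 3"; echo     # prints "set a1 to 3"
```

With a single argument, as in the test, both forms give the same result;
"$*" only states the intent more clearly.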
 
> Could we please have the following inside its own test, so that
> any failure while preparing the test data is caught as an error?
I moved the preparation into the test itself.  Isn't it ugly to have a test saying
something like
	
*   ok 3: prepare shortlog encoding test

?  Or is it better to see where a failure occurs?
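
As I understand the concern, commands that run outside of a test and are
not chained with && can fail without anybody noticing.  A minimal sketch
in plain shell, assuming two hypothetical stand-in functions (one setup
step that fails and one final check that passes), neither of which comes
from the test suite:

```shell
# Hypothetical stand-ins: a setup step that fails, a final check that passes.
prepare_data() { return 1; }
final_check()  { return 0; }

# Without &&, the status of the sequence is that of the last command only.
unchained() { prepare_data; final_check; }
# With &&, the first failure stops the chain and becomes the status.
chained()   { prepare_data && final_check; }

if unchained; then echo "unchained: failure hidden"; else echo "unchained: failure seen"; fi
if chained;   then echo "chained: failure hidden";   else echo "chained: failure seen";   fi
```

This prints "unchained: failure hidden" followed by "chained: failure
seen", which is the reason for joining all steps inside the test with &&.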

> > +git reset --hard "$commit"
> > +git config --unset i18n.commitencoding
> > +echo 2 > a1
> > +git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>" a1
> > +
> > +git config i18n.commitencoding "ISO-8859-15"
> > +echo 3 > a1
> > +git commit --quiet -m "$(iconvfromutf8toiso885915 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso885915 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>")" a1
> > +git config --unset i18n.commitencoding
> > +
> > +git shortlog HEAD~2.. > out
> > +
> > +cat > expect << EOF
> > +Jöhännës "Dschö" Schindëlin (2):
> > +      set a1 to 2 and some non-ASCII chars: Äßø
> > +      set a1 to 3 and some non-ASCII chars: áæï
> > +
> > +EOF
> > +
> > +test_expect_success 'shortlog encoding' 'test_cmp expect out'
> 
> t3900-i18n-commit already uses 8859-1 so if it is not too much to
> ask, it would be much nicer to have these tests work between UTF-8
> and 8859-1, not -15.
> 
> That way, I do not have to worry about breaking tests for people
> who were able to run existing iconv tests because they do not have
> working 8859-15.
OK.
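
To illustrate why the re-encoding matters at all, here is a small sketch,
assuming an iconv that supports ISO-8859-1.  The helper names are
hypothetical.  The same "ö" is two bytes in UTF-8 but a single, different
byte in ISO-8859-1, so an author name copied verbatim from a commit
stored in ISO-8859-1 does not compare equal to the UTF-8 one.

```shell
# "ö" (U+00F6) given as its two UTF-8 bytes in octal, so this script stays ASCII.
utf8_o_umlaut() { printf '\303\266'; }
# Hypothetical helper: dump stdin as hex without separators.
hex() { od -An -tx1 | tr -d ' \n'; }

utf8_o_umlaut | hex; echo                                    # prints "c3b6"
utf8_o_umlaut | iconv -f UTF-8 -t ISO-8859-1 | hex; echo     # prints "f6"
utf8_o_umlaut | iconv -f UTF-8 -t ISO-8859-1 |
	iconv -f ISO-8859-1 -t UTF-8 | hex; echo             # prints "c3b6"
```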

Below is the updated patch.

Best regards
Uwe

------------------>8----------------------
From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Subject: [PATCH] shortlog: respect commit encoding

Don't take the author name information without re-encoding from the raw
commit object buffer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
---
 builtin-shortlog.c  |   20 ++++++++++++--------
 t/t4201-shortlog.sh |   23 +++++++++++++++++++++++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/builtin-shortlog.c b/builtin-shortlog.c
index 8aa63c7..263adc1 100644
--- a/builtin-shortlog.c
+++ b/builtin-shortlog.c
@@ -139,8 +139,13 @@ static void read_from_stdin(struct shortlog *log)
 void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 {
 	const char *author = NULL, *buffer;
+	struct strbuf buf = STRBUF_INIT;
+	struct strbuf ufbuf = STRBUF_INIT;
+	struct pretty_print_context ctx = {0};
 
-	buffer = commit->buffer;
+	pretty_print_commit(CMIT_FMT_RAW, commit, &buf, &ctx);
+
+	buffer = buf.buf;
 	while (*buffer && *buffer != '\n') {
 		const char *eol = strchr(buffer, '\n');
 
@@ -157,20 +162,19 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 		die("Missing author: %s",
 		    sha1_to_hex(commit->object.sha1));
 	if (log->user_format) {
-		struct strbuf buf = STRBUF_INIT;
 		struct pretty_print_context ctx = {0};
 		ctx.abbrev = DEFAULT_ABBREV;
 		ctx.subject = "";
 		ctx.after_subject = "";
 		ctx.date_mode = DATE_NORMAL;
-		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &buf, &ctx);
-		insert_one_record(log, author, buf.buf);
-		strbuf_release(&buf);
-		return;
-	}
-	if (*buffer)
+		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
+		buffer = ufbuf.buf;
+
+	} else if (*buffer)
 		buffer++;
 	insert_one_record(log, author, !*buffer ? "<none>" : buffer);
+	strbuf_release(&ufbuf);
+	strbuf_release(&buf);
 }
 
 static void get_from_rev(struct rev_info *rev, struct shortlog *log)
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index 405b971..03b6950 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -52,4 +52,27 @@ GIT_DIR=non-existing git shortlog -w < log > out
 
 test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
 
+iconvfromutf8toiso88591() {
+	printf "%s" "$*" | iconv -f UTF-8 -t ISO-8859-1
+}
+
+cat > expect << EOF
+Jöhännës "Dschö" Schindëlin (2):
+      set a1 to 2 and some non-ASCII chars: Äßø
+      set a1 to 3 and some non-ASCII chars: áæï
+
+EOF
+
+test_expect_success 'shortlog encoding' '
+git reset --hard "$commit" &&
+git config --unset i18n.commitencoding &&
+echo 2 > a1 &&
+git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>" a1 &&
+git config i18n.commitencoding "ISO-8859-1" &&
+echo 3 > a1 &&
+git commit --quiet -m "$(iconvfromutf8toiso88591 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso88591 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@gmx.de>")" a1 &&
+git config --unset i18n.commitencoding &&
+git shortlog HEAD~2.. > out &&
+test_cmp expect out'
+
 test_done
-- 
1.6.5.3

-- 
Pengutronix e.K.                              | Uwe Kleine-König            |
Industrial Linux Solutions                    | http://www.pengutronix.de/  |
