From: Laszlo Ersek
To: Eric Blake, Paolo Bonzini, qemu-devel@nongnu.org
Cc: armbru@redhat.com
Subject: Re: [Qemu-devel] [PATCH v2 2/4] qjson: do not save/restore contexts
Date: Mon, 23 Nov 2015 21:26:18 +0100
Message-ID: <5653766A.3090300@redhat.com>
In-Reply-To: <56537170.8070600@redhat.com>

On 11/23/15 21:05, Eric Blake wrote:
> On 11/23/2015 10:59 AM, Laszlo Ersek wrote:
>> On 11/23/15 18:44, Paolo Bonzini wrote:
>>> JSON is LL(1) and our parser indeed needs only 1 token lookahead.
>>> Saving the parser context is mostly unnecessary; we can replace it
>>> with peeking at the next token, or remove it altogether when the
>>> restore only happens on errors. The token list is destroyed anyway
>>> on errors.
>>>
>>> The only interesting thing is that parse_keyword always eats
>>> a TOKEN_KEYWORD, even if it is invalid, so it must come last in
>>> parse_value (otherwise, NULL is returned, parse_literal is invoked
>>> and it tries to peek beyond end of input). This is caught by
>>> /errors/unterminated/literal, which actually checks for an unterminated
>>> keyword. ಠ_ಠ
>>
>> Is it accepted practice to put UTF-8 in commit messages? (Or, actually,
>> anywhere in patches, except maybe the notes section?)
>>
>
> Git handles UTF-8 just fine (and for any other encoding, properly
> transmitted in the email, git transcodes to UTF-8 before writing it into
> the repository).
>

Yes, I know. I use latin2:

$ locale
LANG=
LC_CTYPE=hu_HU.ISO8859-2
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

and from my git config:

[i18n]
        logOutputEncoding = latin2
        commitencoding = latin2

This works very well -- as long as it doesn't choke on something outside
of latin2 --, both the glibc locale support and git are doing their jobs
perfectly fine; my question concerned any other users who decided to
stay with single-byte encodings (with an ASCII subset).

(I believe that RFCs stick with ASCII to this day, and I also think that
our source code and docs/ should stick with ASCII; but I know I can't
plausibly argue for the same in commit messages, assuming I'm alone with
that anyway. BTW I should have written "non-ASCII Unicode code points"
in my original question, rather than "UTF-8".)

Thanks!
Laszlo
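
To illustrate the "choke on something outside of latin2" case: the sketch
below uses Python, not anything git or glibc actually runs, to check
whether a string has a representation in a given encoding. The helper name
`representable` is made up for this example; the character tested is
U+0CA0, which appears twice in the quoted commit message.

```python
# Hypothetical helper (not part of git, glibc, or QEMU): report whether
# every character of `text` can be encoded in `encoding`.
def representable(text: str, encoding: str) -> bool:
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+0CA0 twice, with an underscore between, as in the commit message.
face = "\u0ca0_\u0ca0"

# UTF-8 needs three bytes for U+0CA0: e0 b2 a0.
print(face.encode("utf-8").hex())          # e0b2a05fe0b2a0

# "latin2" is Python's alias for ISO 8859-2, the charset of hu_HU.ISO8859-2.
print(representable(face, "latin2"))       # False
print(representable("L\u00e1szl\u00f3", "latin2"))  # True
```

This only shows that the character has no ISO 8859-2 byte; how git or a
terminal reacts when asked to re-encode such a commit message is not
covered here.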