From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:44933)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UJ0Or-0002NB-Bf for qemu-devel@nongnu.org;
	Fri, 22 Mar 2013 07:44:49 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1UJ0On-0000L6-3b for qemu-devel@nongnu.org;
	Fri, 22 Mar 2013 07:44:45 -0400
Received: from mx1.redhat.com ([209.132.183.28]:53133)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UJ0Om-0000Ky-SF for qemu-devel@nongnu.org;
	Fri, 22 Mar 2013 07:44:41 -0400
Message-ID: <514C44A7.9020804@redhat.com>
Date: Fri, 22 Mar 2013 12:46:47 +0100
From: Laszlo Ersek
MIME-Version: 1.0
References: <1363283360-26220-1-git-send-email-armbru@redhat.com>
	<1363283360-26220-2-git-send-email-armbru@redhat.com>
	<514B616D.3020604@redhat.com>
	<87ppyryeim.fsf@blackfin.pond.sub.org>
In-Reply-To: <87ppyryeim.fsf@blackfin.pond.sub.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/4] unicode: New mod_utf8_codepoint()
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Markus Armbruster
Cc: blauwirbel@gmail.com, aliguori@us.ibm.com, qemu-devel@nongnu.org

On 03/22/13 10:23, Markus Armbruster wrote:
> Laszlo Ersek writes:
>> On 03/14/13 18:49, Markus Armbruster wrote:
>>> +{
>>> +    static int min_cp[5] = { 0x80, 0x800, 0x10000, 0x200000, 0x4000000 };
>>> +    const unsigned char *p;
>>> +    unsigned byte, mask, len, i;
>>> +    int cp;
>>> +
>>> +    if (n == 0) {
>>> +        *end = (char *)s;
>>> +        return 0;
>>> +    }
>>
>> This is a special case (we return the code point U+0000 after looking at
>> zero bytes); we can probably expect the caller to know about n==0.
>
> We could make it an error instead.  What's your gut feeling?

(If the question still stands -- maybe it doesn't any more, considering
future handling of '\0':)

I guess this function would be called in a loop, with increasing "s" and
decreasing "n" values. "end" can only be checked after the first call.

If you write a loop that checks "end" in the controlling expression,
then accepting n==0 without error is useful.

If you write a loop that checks "n" in the controlling expression, then
refusing n==0 is OK.

I'd probably write the latter kind of loop (I like pre-testing more),
but I can't say in general :)

Laszlo
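
To make the two loop shapes concrete, here is a rough sketch. toy_codepoint() is a hypothetical stand-in, not the code from the patch: it treats every byte as one code point and only mirrors the n==0 behaviour quoted above. The two counting helpers are likewise made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the decoder under discussion (NOT the
 * patch's code): each byte is one code point, bytes >= 0x80 yield -1,
 * and n == 0 yields 0 with *end == s, as in the quoted hunk.
 */
static int toy_codepoint(const char *s, size_t n, char **end)
{
    unsigned char byte;

    if (n == 0) {
        *end = (char *)s;
        return 0;
    }
    byte = (unsigned char)*s;
    *end = (char *)s + 1;
    return byte < 0x80 ? (int)byte : -1;
}

/* Checks "end" in the controlling expression: needs n == 0 accepted. */
static size_t count_testing_end(const char *s, size_t n)
{
    size_t count = 0;
    char *end;

    while (toy_codepoint(s, n, &end), end != s) {
        count++;
        n -= (size_t)(end - s);
        s = end;
    }
    return count;
}

/* Pre-tests "n": the decoder is never called with n == 0. */
static size_t count_testing_n(const char *s, size_t n)
{
    size_t count = 0;
    char *end;

    while (n > 0) {
        toy_codepoint(s, n, &end);
        assert(end != s);   /* the stand-in always consumes a byte here */
        count++;
        n -= (size_t)(end - s);
        s = end;
    }
    return count;
}
```

In the first loop the final call is made with n==0 and its "no progress" result ends the iteration, so an error there would need separate handling. In the second loop that call never happens, so the decoder is free to reject n==0.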