qemu-trivial.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>,
	Richard Henderson <rth@twiddle.net>
Cc: "patches@linaro.org" <patches@linaro.org>,
	QEMU Trivial <qemu-trivial@nongnu.org>,
	Cornelia Huck <cohuck@redhat.com>,
	 Alexander Graf <agraf@suse.de>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Kamil Rytarowski <kamil@netbsd.org>,
	"qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-trivial] [Qemu-devel] [PATCH for-2.10] Use qemu_tolower() and qemu_toupper(), not tolower() and toupper()
Date: Thu, 20 Jul 2017 16:29:03 -0500	[thread overview]
Message-ID: <2bba3c89-e4f0-655b-c9ed-aab911beaaa7@redhat.com> (raw)
In-Reply-To: <CAFEAcA8tkW2bk_O3iDO-wWpY9EUbUNkUL93OShipsvckj-jkoQ@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 2781 bytes --]

On 07/20/2017 04:03 PM, Peter Maydell wrote:
> On 20 July 2017 at 19:26, Richard Henderson <rth@twiddle.net> wrote:
>> On 07/20/2017 06:31 AM, Peter Maydell wrote:
>>>
>>> gdbstub.c:914:13: warning: array subscript has type 'char'
>>>
>>> This reflects the fact that toupper() and tolower() give
>>> undefined behaviour if they are passed a value that isn't
>>> a valid 'unsigned char' or EOF.
>>
>>
>> Not saying we shouldn't use qemu_tolower etc, but this statement is not true
>> at all.  Officially, the argument to toupper and tolower is type int.
> 
> (I was going to quote POSIX here but I see Eric has done so already.)
> 
>> This sounds like a bug in NetBSD -- though it may not even be that, as they
>> may have done something clever and put the symbol in the middle of the data.
>> A trick that worked before compiler warnings got smarter.
> 
> https://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/sys/sys/ctype_inline.h
> The implementation is
>   #define toupper(c) ((int)((_toupper_tab_ + 1)[(c)]))
> 
> (where I assume _toupper_tab_ is the obvious array.)

In fact, if NetBSD's implementation is anything like Cygwin's, then it
is a safe bet that _toupper_tab_ is a symbol that lands in the middle of
a larger 384-byte array, such that _toupper_tab_[-1] (for (signed
char)'\xfe') happens to resolve to a valid image address, and will have
the same contents as _toupper_tab_[255] (for (unsigned char)'\xfe'),
precisely to cater to so many clueless programs have forgotten about
signed char widening to negative values (even though the standard says
the behavior is undefined, quality-of-implementation demands that you
try to do the sane thing anyway).

The table is of course only MOSTLY symmetric; there are some single-byte
locales where isspace((unsigned char)'\xff') is true but isspace(EOF) is
false (in using the magic-middle-of-array locations, _tab_[0] cannot
always match _tab_[256] in all locales).  And it is wraparound where
'\xff' collides with -1 that explains WHY C99 documents the range to be
ints corresponding to unsigned char.

Side note: NetBSD has a bug in their ctype.h.
toupper(INT64_C(0x100000001)) is supposed to be the same as toupper(1),
since POSIX requires toupper() to be available as a function (and
permits a macro as well).  A function with an int parameter must
gracefully truncate a 64-bit input down to 32-bits, so the macro must do
the same; but the NetBSD macro uses all 64 bits to dereference into
garbage memory locations.  Fortunately, no one runs into the bug in real
life because no one really tries to pass 64-bit numbers to ctype macros.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

  reply	other threads:[~2017-07-20 21:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-20 16:31 [Qemu-trivial] [PATCH for-2.10] Use qemu_tolower() and qemu_toupper(), not tolower() and toupper() Peter Maydell
2017-07-20 17:57 ` [Qemu-trivial] [Qemu-devel] " Eric Blake
2017-07-20 18:08 ` [Qemu-trivial] " Christian Borntraeger
2017-07-20 18:26 ` Richard Henderson
2017-07-20 18:48   ` [Qemu-trivial] [Qemu-devel] " Eric Blake
2017-07-20 18:57     ` Richard Henderson
2017-07-20 19:04       ` Eric Blake
2017-07-20 19:14         ` Richard Henderson
2017-07-20 21:03   ` [Qemu-trivial] " Peter Maydell
2017-07-20 21:29     ` Eric Blake [this message]
2017-07-20 21:42       ` [Qemu-trivial] [Qemu-devel] " Peter Maydell
2017-07-20 23:58 ` [Qemu-trivial] " David Gibson
2017-07-21 10:23 ` [Qemu-trivial] [Qemu-devel] " Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2bba3c89-e4f0-655b-c9ed-aab911beaaa7@redhat.com \
    --to=eblake@redhat.com \
    --cc=agraf@suse.de \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=kamil@netbsd.org \
    --cc=patches@linaro.org \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=qemu-trivial@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).