From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52038) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zz62w-0005lv-D7 for qemu-devel@nongnu.org; Wed, 18 Nov 2015 11:57:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zz62t-0003rf-7B for qemu-devel@nongnu.org; Wed, 18 Nov 2015 11:57:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:44259) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zz62s-0003rb-Vn for qemu-devel@nongnu.org; Wed, 18 Nov 2015 11:57:23 -0500 References: <1447836791-369-1-git-send-email-eblake@redhat.com> <1447836791-369-22-git-send-email-eblake@redhat.com> <87oaerttjc.fsf@blackfin.pond.sub.org> From: Eric Blake Message-ID: <564CADEC.1090902@redhat.com> Date: Wed, 18 Nov 2015 09:57:16 -0700 MIME-Version: 1.0 In-Reply-To: <87oaerttjc.fsf@blackfin.pond.sub.org> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="a8l0pJQnQTQ0jgAc2rXeeHfo5twaiHr2t" Subject: Re: [Qemu-devel] [PATCH v12 21/36] qapi: Tighten the regex on valid names List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster Cc: qemu-devel@nongnu.org, Michael Roth This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --a8l0pJQnQTQ0jgAc2rXeeHfo5twaiHr2t Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 11/18/2015 04:51 AM, Markus Armbruster wrote: > Eric Blake writes: >=20 >> We already documented that qapi names should match specific >> patterns (such as starting with a letter unless it was an enum >> value or a downstream extension). Tighten that from a suggestion >> into a hard requirement, which frees up names beginning with a >> single underscore for qapi internal usage. >=20 > If we care enough about naming conventions to document them, enforcing > them makes only sense. >=20 >> Also restrict things >> to avoid 'a__b' or 'a--b' (that is, dash and underscore must not >> occur consecutively). >=20 > I feel this is entering "foolish names" territory, where things like > "IcAnFiNdNaMeSeVeNlEsSrEaDaBlEtHaNStudlyCaps" live. Catching fools IhAdToOmUcHfUnReAdInGtHaT [It's amazing that scholars can ever manage to correctly interpret old written languages, thanks to omitted word breaks (scriptio continua) or omitted vowels (such as in Hebrew)] > automatically is generally futile, they're too creative :) >=20 > Let's drop this part. Sure, I can do that. I'll post a fixup patch, as it will affect docs too.= >=20 >> Add a new test, reserved-member-underscore, to demonstrate the >> tighter checking. >> >> Signed-off-by: Eric Blake >> >> --- >> v12: new patch >> --- >> +incompatibly in a future release. All names must begin with a letter= , >> +and contain only ASCII letters, digits, dash, and underscore, where >> +dash and underscore cannot occur consecutively. There are two Namely, right here. >> +# Names must be letters, numbers, -, and _. They must start with let= ter, >> +# except for downstream extensions which must start with __RFQDN_. >> +# Dots are only valid in the downstream extension prefix. >> +valid_name =3D re.compile('^(__[a-zA-Z][a-zA-Z0-9.]*_)?' >> + '[a-zA-Z][a-zA-Z0-9]*([_-][a-zA-Z0-9]+)*$') >=20 > This regexp consists of two parts: optional __RFQDN_ prefix and the nam= e > proper. >=20 > The latter stays simpler if we don't attempt to catch adjacent [-_]. >=20 > The former isn't quite right. According to RFC 822 Appendix 1 - Domain= > Name Syntax Specification: >=20 > * We must accept '-' in addition to digits. >=20 > * We require RFQDN to start with a letter. This is still a loose match= =2E > The tight match is "labels separated by dots", where labels consist o= f > letters, digits and '-', starting with a letter. I wouldn't bother > checking first characters are letters at all. >=20 > Recommend >=20 > valid_name =3D re.compile('^(__[a-zA-Z0-9.-]+_)?' > '[a-zA-Z][a-zA-Z0-9_-]*$') Sure, that works for me. It's tighter than what we had before, but looser than what I proposed so that it allows more RFQDN. It potentially lets users confuse us by abusing 'foo__max' or similar, but I'm less worried about that abuse - it's okay to stop the blatantly obvious mistakes without worrying about the corner cases, if it makes the maintenance easier for the cases we do catch. >=20 >> >> >> def check_name(expr_info, source, name, allow_optional=3DFalse, >> @@ -374,8 +376,8 @@ def check_name(expr_info, source, name, allow_opti= onal=3DFalse, >> % (source, name)) >> # Enum members can start with a digit, because the generated C >> # code always prefixes it with the enum name >> - if enum_member: >> - membername =3D '_' + membername >> + if enum_member and membername[0].isdigit(): >=20 > What's wrong with the old condition? >=20 >> + membername =3D 'D' + membername The old code prepended a lone '_', which doesn't work any more with the tighter valid_name regex, so we have to use some other prefix (I picked 'D'). >> # Reserve the entire 'q_' namespace for c_name() >> if not valid_name.match(membername) or \ >> c_name(membername, False).startswith('q_'): The other thing is that I realized that an enum value of 'q-int' was getting munged to '_q-int' which no longer gets flagged by the c_name() filter. But neither would 'Dq-int' get flagged. So limiting the munging to just enum values that start with a digit (where we know it doesn't start with q) means we don't weaken the second condition. I guess I need to call that out in the commit message; all the more reason for me to send a fixup. --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --a8l0pJQnQTQ0jgAc2rXeeHfo5twaiHr2t Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJWTK3tAAoJEKeha0olJ0NqxJoIAKQDWE7Oy9Pw9Py0uJ5MGHhc ATZXdrUCGuyb/edkzea3z9hIgdJ79+1/yIm0LQEIM4uD21zJ2oTe/FqG8thJS6V+ /X3wFkOUdq9cnB8MZPZkjRu3krqcKU/o1hpflgWPl60eKNJfERsAuUkjsYVgIXCk Lq2IpE2vYDlnvIU4LePEkl5mfWKfajY/lcZGzPGrSqeWPz4154PTAYYUOX2BGHrk zTnn+8FDQENV2A95K87d6/8WShaxqPx9rfboUBbBHjM1LWWk8ozyo7MNAD/sMF/Q Kj2JPSRyrUMQZEia4boQDpla/G95HCIallPtHOeSmI0EYIBym4B1Fwa+MKiQZos= =TzUi -----END PGP SIGNATURE----- --a8l0pJQnQTQ0jgAc2rXeeHfo5twaiHr2t--