qemu-devel.nongnu.org archive mirror
From: Thomas Huth <thuth@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-devel@nongnu.org, "Markus Armbruster" <armbru@redhat.com>,
	"Alistair Francis" <alistair@alistair23.me>,
	"Edgar E. Iglesias" <edgar.iglesias@gmail.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Francisco Iglesias" <francisco.iglesias@amd.com>,
	"Eduardo Habkost" <eduardo@habkost.net>
Subject: Re: [PATCH v2 5/5] qom/object: Limit type names to alphanumerical and some few special characters
Date: Thu, 16 Nov 2023 18:54:29 +0100
Message-ID: <ee008c55-89a9-4422-9ea0-cab60a8961af@redhat.com>
In-Reply-To: <ZVYX8mcanVBl9-ho@redhat.com>

On 16/11/2023 14.24, Daniel P. Berrangé wrote:
> On Thu, Nov 16, 2023 at 02:14:54PM +0100, Thomas Huth wrote:
>> QOM names currently don't have any enforced naming rules. This
>> can be problematic, e.g. when they are used on the command line
>> for the "-device" option (where the comma is used to separate
>> properties). To avoid that such problematic type names come in
>> again, let's restrict the set of acceptable characters during the
>> type registration.
>>
>> Ideally, we'd apply here the same rules as for QAPI, i.e. all type
>> names should begin with a letter, and contain only ASCII letters,
>> digits, hyphen, and underscore. However, we already have so many
>> pre-existing types like:
>>
>>      486-x86_64-cpu
>>      cfi.pflash01
>>      power5+_v2.1-spapr-cpu-core
>>      virt-2.6-machine
>>      pc-i440fx-3.0-machine
>>
>> ... so that we have to allow "." and "+" for now, too. While the
>> dot is used in a lot of places, the "+" can fortunately be limited
>> to two classes of legacy names ("power" and "Sun-UltraSparc" CPUs).
>>
>> We also cannot enforce the rule that names must start with a letter
>> yet, since there are a lot of types that start with a digit. Still,
>> at least limiting the first characters to the alphanumerical range
>> should be way better than nothing.
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>   qom/object.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 41 insertions(+)
>>
>> diff --git a/qom/object.c b/qom/object.c
>> index 95c0dc8285..571ef68950 100644
>> --- a/qom/object.c
>> +++ b/qom/object.c
>> @@ -138,9 +138,50 @@ static TypeImpl *type_new(const TypeInfo *info)
>>       return ti;
>>   }
>>   
>> +static bool type_name_is_valid(const char *name)
>> +{
>> +    const int slen = strlen(name);
>> +
>> +    g_assert(slen > 1);
>> +
>> +    /*
>> +     * Ideally, the name should start with a letter - however, we've got
>> +     * too many names starting with a digit already, so allow digits here,
>> +     * too (except '0' which is not used yet)
>> +     */
>> +    if (!g_ascii_isalnum(name[0]) || name[0] == '0') {
>> +        return false;
>> +    }
>> +
>> +    for (int i = 1; i < slen; i++) {
>> +        if (name[i] != '-' && name[i] != '_' && name[i] != '.' &&
>> +            !g_ascii_isalnum(name[i])) {
>> +            if (name[i] == '+') {
>> +                if (i == 6 && !strncmp(name, "power", 5)) {
>> +                    /* It's a legacy name like "power5+" */
>> +                    continue;
>> +                }
>> +                if (i >= 17 && !strncmp(name, "Sun-UltraSparc", 14)) {
>> +                    /* It's a legacy name like "Sun-UltraSparc-IV+" */
>> +                    continue;
>> +                }
>> +            }
>> +            return false;
>> +        }
>> +    }
> 
> Replace this big loop with strspn, which has an asm optimized impl
> in glibc
> 
>        ALPHA_LC "abcdefghijklmnopqrstuvwxyz"
>        ALPHA_UC "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>        OTHER "0123456789-_."
> 
>        return (strspn(name, ALPHA_UC ALPHA_LC OTHER) == slen) ||
>            (g_str_has_prefix(name, "power") && slen > 6 && name[6] == '+') ||
> 	  (g_str_has_prefix(name, "Sun-UltraSparc") && slen > 17 && name[17] == '+');

It's quite hard to believe that a function that has to compare each and every 
character against a string of acceptable characters is faster than a function 
that uses something like g_ascii_isalnum(), which can check a range of 
characters in one go...

So I gave it a try, wrote two test programs, one with my implementation and 
one with yours, and looped on the function 1000000000 times. And indeed, for 
short strings (less than 30 characters), my function is about three times 
faster than the one with strspn() (mine takes ~ 13 seconds, the strspn() one 
takes ~ 39 seconds).

Interestingly, for larger strings (more than 140 characters), the strspn() 
implementation starts to perform better. They indeed must have an 
optimization that kicks in for larger strings.
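
For reference, a comparison along these lines could be timed with a small
harness like the sketch below. It is not the test program used for the
numbers above: valid_loop(), valid_strspn(), is_ascii_alnum() and
time_validator() are hypothetical names, is_ascii_alnum() stands in for
g_ascii_isalnum() so that the sketch builds without GLib, and the legacy "+"
exceptions are left out for brevity.

```c
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define VALID_CHARS \
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_."

/* Assumption: stand-in for g_ascii_isalnum(), ASCII only */
static bool is_ascii_alnum(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* Range-check variant: one comparison chain per character */
static bool valid_loop(const char *name)
{
    size_t slen = strlen(name);

    if (slen < 2 || name[0] == '0' || !is_ascii_alnum(name[0])) {
        return false;
    }
    for (size_t i = 1; i < slen; i++) {
        char c = name[i];
        if (c != '-' && c != '_' && c != '.' && !is_ascii_alnum(c)) {
            return false;
        }
    }
    return true;
}

/* strspn() variant: let libc scan against the accepted character set */
static bool valid_strspn(const char *name)
{
    size_t slen = strlen(name);

    if (slen < 2 || name[0] == '0' || !is_ascii_alnum(name[0])) {
        return false;
    }
    return strspn(name, VALID_CHARS) == slen;
}

/* Returns the CPU time in seconds spent calling fn(name) iterations times */
static double time_validator(bool (*fn)(const char *), const char *name,
                             long iterations)
{
    /* volatile keeps the compiler from hoisting the call out of the loop */
    const char *volatile p = name;
    volatile bool sink = false;
    clock_t start = clock();

    for (long i = 0; i < iterations; i++) {
        sink = fn(p);
    }
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_validator(valid_loop, "cfi.pflash01", 1000000000L) and the same
with valid_strspn should give a rough comparison. Absolute numbers will depend
on the compiler flags and the libc version.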

Now while my implementation seems to be a little bit faster for the strings 
that we are using in QEMU, we certainly don't have 1000000000 different 
types in QEMU, but rather only 1300 or so, so the performance shouldn't 
really matter that much here. And I have to admit that your code is indeed 
more compact to read, so I'll give it a try.
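
A strspn()-based variant that keeps the first-character rule and the two
legacy "+" exceptions from the patch could look roughly like the sketch below.
This is only an illustration, not the code that was eventually committed.
Assumptions: the name type_name_is_valid_strspn() and the DIGITS/OTHER split
are made up here, names that are too short are rejected instead of hitting
g_assert(), and at most one "+" is accepted, which is slightly stricter than
the loop in the patch.

```c
#include <stdbool.h>
#include <string.h>

#define ALPHA_LC "abcdefghijklmnopqrstuvwxyz"
#define ALPHA_UC "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
#define DIGITS   "0123456789"
#define OTHER    "-_."

/* Hypothetical variant of type_name_is_valid(), for illustration only */
static bool type_name_is_valid_strspn(const char *name)
{
    size_t slen = strlen(name);
    size_t plus_pos;

    /* Assumption: reject short names rather than asserting */
    if (slen < 2) {
        return false;
    }

    /* First character: ASCII letter or digit, but not '0' */
    if (name[0] == '0' || !strchr(ALPHA_UC ALPHA_LC DIGITS, name[0])) {
        return false;
    }

    /* Length of the leading run of generally accepted characters */
    plus_pos = strspn(name, ALPHA_UC ALPHA_LC DIGITS OTHER);
    if (plus_pos == slen) {
        return true;
    }

    /* The only other character tolerated is '+', and only in legacy names */
    if (name[plus_pos] != '+') {
        return false;
    }
    if (!(plus_pos == 6 && !strncmp(name, "power", 5)) &&
        !(plus_pos >= 17 && !strncmp(name, "Sun-UltraSparc", 14))) {
        return false;
    }

    /* Everything after the '+' must again be plain accepted characters */
    return strspn(name + plus_pos + 1, ALPHA_UC ALPHA_LC DIGITS OTHER) ==
           slen - plus_pos - 1;
}
```

With this shape, names such as "power5+_v2.1-spapr-cpu-core" and
"Sun-UltraSparc-IV+-sparc64-cpu" pass, while "qemu:dummy" or a name
containing a comma is rejected.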

  Thomas




