From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:58910) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UbGfx-0006kc-PT for qemu-devel@nongnu.org; Sat, 11 May 2013 16:45:56 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UbGfw-0000zQ-DP for qemu-devel@nongnu.org; Sat, 11 May 2013 16:45:53 -0400
Received: from mail-ob0-x234.google.com ([2607:f8b0:4003:c01::234]:49551) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UbGfw-0000zK-67 for qemu-devel@nongnu.org; Sat, 11 May 2013 16:45:52 -0400
Received: by mail-ob0-f180.google.com with SMTP id xk17so3126751obc.25 for ; Sat, 11 May 2013 13:45:51 -0700 (PDT)
From: Anthony Liguori
In-Reply-To: <518E8F60.6020704@redhat.com>
References: <1368211675-2912-1-git-send-email-aliguori@us.ibm.com> <20130510214740.GN31148@hall.aurel32.net> <877gj65to0.fsf@codemonkey.ws> <518E8F60.6020704@redhat.com>
Date: Sat, 11 May 2013 15:45:49 -0500
Message-ID: <87vc6pjldu.fsf@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH for-1.5] qom: optimize casting to leaf class and parent class
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini , qemu-devel , Aurelien Jarno

Paolo Bonzini writes:

> On 11/05/2013 00:58, Anthony Liguori wrote:
>> Aurelien Jarno writes:
>>
>>> On Fri, May 10, 2013 at 01:47:55PM -0500, Anthony Liguori wrote:
>>>> Most QOM types use type_register_static but we still strdup the
>>>> passed data. However, the original pointers are useful because
>>>> GCC is pretty good about collapsing strings so it's very likely any
>>>> use of the pointer will end up being that same address.
>>>>
>>>> IOW, with a little trickery, we can compare types by just comparing
>>>> strings and in fact that's what we do here.
>>>>
>>>> We do this for the two most common cases, casting to a leaf class
>>>> or to the parent class.
>>>>
>>>> With these two changes, I see a decrease from around 2 hash table
>>>> lookups to only a thousand with no run time lookups at all.
>>>>
>>>> Cc: Paolo Bonzini
>>>> Cc: Aurelien Jarno
>>>> Cc: Andreas Färber
>>>> Reported-by: Aurelien Jarno
>>>> Signed-off-by: Anthony Liguori
>>>> ---
>>>> Aurelien, could you please try this patch with your PPC test case?
>>>> ---
>>>>  qom/object.c | 16 ++++++++++++++--
>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/qom/object.c b/qom/object.c
>>>> index 75e6aac..5ecfd28 100644
>>>> --- a/qom/object.c
>>>> +++ b/qom/object.c
>>>> @@ -132,7 +132,13 @@ TypeImpl *type_register(const TypeInfo *info)
>>>>
>>>>  TypeImpl *type_register_static(const TypeInfo *info)
>>>>  {
>>>> -    return type_register(info);
>>>> +    TypeImpl *impl;
>>>> +
>>>> +    impl = type_register(info);
>>>> +    impl->name = info->name;
>>>> +    impl->parent = info->parent;
>>>> +
>>>> +    return impl;
>>>>  }
>
> This is ok with a comment.
>
>>>>  static TypeImpl *type_get_by_name(const char *name)
>>>> @@ -449,10 +455,16 @@ Object *object_dynamic_cast_assert(Object *obj, const char *typename)
>>>>  ObjectClass *object_class_dynamic_cast(ObjectClass *class,
>>>>                                         const char *typename)
>>>>  {
>>>> -    TypeImpl *target_type = type_get_by_name(typename);
>>>> +    TypeImpl *target_type;
>>>>      TypeImpl *type = class->type;
>>>>      ObjectClass *ret = NULL;
>>>>
>>>> +    if (type->name == typename || type->parent == typename) {
>>>> +        return class;
>>>> +    }
>
> I prefer my patch 3/9. With the hunk above, it works fine for the
> simple case of casts in a device model's callbacks (testing type->parent
> would almost always fail, so it is not worthwhile).
>
> Unfortunately, strcmp is just as bad as a hashtable lookup (both are
> O(n) in the size of the string, instead of O(1)).
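
For reference, the fast path in the hunk above boils down to the following. This is a minimal standalone sketch, not the QOM code: DemoType, demo_is_a, demo_lookup and the TYPE_DEMO_* names are made up for illustration, and a linear strcmp scan stands in for the hash table. The names are static arrays rather than string literals so that pointer identity does not depend on the compiler merging identical strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for TypeImpl; not the real QOM structure. */
typedef struct DemoType {
    const char *name;    /* pointer exactly as passed at registration */
    const char *parent;  /* parent type name, NULL for the root */
} DemoType;

/* Hypothetical type names with stable addresses. */
static const char TYPE_DEMO_ROOT[] = "demo-root";
static const char TYPE_DEMO_BASE[] = "demo-base";
static const char TYPE_DEMO_LEAF[] = "demo-leaf";

static const DemoType demo_types[] = {
    { TYPE_DEMO_ROOT, NULL },
    { TYPE_DEMO_BASE, TYPE_DEMO_ROOT },
    { TYPE_DEMO_LEAF, TYPE_DEMO_BASE },
};
#define DEMO_NTYPES (sizeof(demo_types) / sizeof(demo_types[0]))

static unsigned slow_lookups;  /* how often the slow path was taken */

/* Slow path: find a type by comparing string contents. */
static const DemoType *demo_lookup(const char *name)
{
    assert(name != NULL);
    slow_lookups++;
    for (size_t i = 0; i < DEMO_NTYPES; i++) {
        if (strcmp(demo_types[i].name, name) == 0) {
            return &demo_types[i];
        }
    }
    return NULL;
}

/* Returns 1 if `type` is `target_name` or descends from it, else 0. */
static int demo_is_a(const DemoType *type, const char *target_name)
{
    assert(type != NULL && target_name != NULL);

    /* Fast path: pointer equality only, no string contents are read. */
    if (type->name == target_name || type->parent == target_name) {
        return 1;
    }

    /* Slow path: resolve the target, then walk up the parent chain. */
    const DemoType *target = demo_lookup(target_name);
    if (!target) {
        return 0;  /* unknown target type, so fail the cast */
    }
    while (type) {
        if (type == target) {
            return 1;
        }
        type = type->parent ? demo_lookup(type->parent) : NULL;
    }
    return 0;
}
```

A cast to the leaf's own name or to its direct parent never reaches demo_lookup. A cast to anything further up the hierarchy, or through a pointer to a different copy of the same string, still falls through to the slow path.
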

FWIW, I've been experimenting with Aurelien's image. The issue here is
that the PPC CPUs are 4 levels deep, consisting of the leaf model, a PPC
base model, the PPC CPU model, then the CPU object. So these checks
(regardless of how they're done) will never catch because we're
attempting to cast the object to either the PPC CPU model or CPU object
model.

I've been experimenting with a front side cache on top of the hash
table. I see a nice improvement but still not as good as your patch.
I'll continue experimenting.

I fear that the only way to get parity is to expose the TypeImpl to the
cast macro to avoid the hash lookup entirely. Unfortunately that's a
touch-everything change.

Regards,

Anthony Liguori

>
> Paolo
>
>>>> +    target_type = type_get_by_name(typename);
>>>> +
>>>>      if (!target_type) {
>>>>          /* target class type unknown, so fail the cast */
>>>>          return NULL;
>>>
>>> Unfortunately it doesn't fix the problem. I only see a 0.5% improvement,
>>> which might be in the noise. I still see g_hash_table_lookup and
>>> g_str_hash quite high in perf top.
>>
>> I was afraid of this. I assume the cast comes somewhere other than
>> where the type was registered.
>>
>> This patch should address that. Could you post an image too? Then I
>> don't have to keep bugging you with updated patches.
>>
>>
>>
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>>
>>> --
>>> Aurelien Jarno GPG: 1024D/F1BCDB73
>>> aurelien@aurel32.net http://www.aurel32.net
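
One possible shape for a front side cache like the one mentioned in the reply is a small direct-mapped table keyed on the address of the type name, so that a hit costs one pointer comparison. This is a sketch under stated assumptions, not the code from the experiment: CachedType, cached_lookup, slow_lookup, CACHE_SLOTS and the TYPE_DEMO_* names are all hypothetical, and a strcmp scan stands in for the real hash table. Keying on the address is only sound if the string at that address never changes, which holds for static type names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 16  /* arbitrary size, must be a power of two */

/* Hypothetical stand-in for TypeImpl. */
typedef struct CachedType {
    const char *name;
} CachedType;

/* One cache entry: the name pointer seen and the type it resolved to. */
typedef struct CacheSlot {
    const char *key;
    const CachedType *value;
} CacheSlot;

static const char TYPE_DEMO_CPU[] = "demo-cpu";
static const char TYPE_DEMO_DEVICE[] = "demo-device";

static const CachedType demo_types[] = {
    { TYPE_DEMO_CPU },
    { TYPE_DEMO_DEVICE },
};

static CacheSlot cache[CACHE_SLOTS];
static unsigned slow_lookups;  /* how often the cache missed */

/* Slow path: stands in for the hash table lookup by string contents. */
static const CachedType *slow_lookup(const CachedType *table, size_t n,
                                     const char *name)
{
    slow_lookups++;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Look up a type, consulting the pointer-keyed cache first. */
static const CachedType *cached_lookup(const CachedType *table, size_t n,
                                       const char *name)
{
    assert(name != NULL);  /* NULL would alias an empty slot */

    /* Pick a slot from the address bits; low bits are dropped because
     * they tend to be similar for neighbouring objects. */
    size_t slot = ((uintptr_t)name >> 4) & (CACHE_SLOTS - 1);

    if (cache[slot].key == name) {
        return cache[slot].value;  /* hit: no string hashing or compare */
    }

    const CachedType *type = slow_lookup(table, n, name);
    if (type) {
        /* Only successful lookups are cached; a colliding entry is
         * simply overwritten. */
        cache[slot].key = name;
        cache[slot].value = type;
    }
    return type;
}
```

Repeated casts through the same name pointer hit the cache after the first lookup. Unknown names and names that collide in a slot keep paying for the slow path, which may be part of why a cache like this helps but does not remove the lookup cost entirely.
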