From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41426) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1acDw9-00045K-8v for qemu-devel@nongnu.org; Sat, 05 Mar 2016 10:16:10 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1acDw6-0005ul-04 for qemu-devel@nongnu.org; Sat, 05 Mar 2016 10:16:09 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47358) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1acDw5-0005uU-Nw for qemu-devel@nongnu.org; Sat, 05 Mar 2016 10:16:05 -0500 References: <1455900463-16007-1-git-send-email-berrange@redhat.com> <1455900463-16007-2-git-send-email-berrange@redhat.com> <56D71147.4050204@redhat.com> <20160303110135.GB32270@redhat.com> From: Max Reitz Message-ID: <56DAF82F.8060401@redhat.com> Date: Sat, 5 Mar 2016 16:15:59 +0100 MIME-Version: 1.0 In-Reply-To: <20160303110135.GB32270@redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj" Subject: Re: [Qemu-devel] [PATCH v1 01/10] qdict: implement a qdict_crumple method for un-flattening a dict List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Daniel P. Berrange" Cc: Paolo Bonzini , qemu-devel@nongnu.org, =?UTF-8?Q?Andreas_F=c3=a4rber?= , Markus Armbruster This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj Content-Type: multipart/mixed; boundary="F9sabQ7TTRMvFmfBXvc3CLuPmOgkqNnCd" From: Max Reitz To: "Daniel P. Berrange" Cc: qemu-devel@nongnu.org, Paolo Bonzini , Markus Armbruster , =?UTF-8?Q?Andreas_F=c3=a4rber?= Message-ID: <56DAF82F.8060401@redhat.com> Subject: Re: [Qemu-devel] [PATCH v1 01/10] qdict: implement a qdict_crumple method for un-flattening a dict References: <1455900463-16007-1-git-send-email-berrange@redhat.com> <1455900463-16007-2-git-send-email-berrange@redhat.com> <56D71147.4050204@redhat.com> <20160303110135.GB32270@redhat.com> In-Reply-To: <20160303110135.GB32270@redhat.com> --F9sabQ7TTRMvFmfBXvc3CLuPmOgkqNnCd Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 03.03.2016 12:01, Daniel P. Berrange wrote: > On Wed, Mar 02, 2016 at 05:13:59PM +0100, Max Reitz wrote: >> On 19.02.2016 17:47, Daniel P. Berrange wrote: >>> The qdict_flatten() method will take a dict whose elements are >>> further nested dicts/lists and flatten them by concatenating >>> keys. >>> >>> The qdict_crumple() method aims todo the reverse, taking a flat >>> qdict, and turning it into a set of nested dicts/lists. It will >>> apply nesting based on the key name, with a '.' indicating a >>> new level in the hierarchy. If the keys in the nested structure >>> are all numeric, it will create a list, otherwise it will create >>> a dict. >>> >>> If the keys are a mixture of numeric and non-numeric, or the >>> numeric keys are not in strictly ascending order, an error will >>> be reported. [...] >>> + if (!child) { >>> + goto error; >>> + } >>> + >>> + qdict_put_obj(tmp2, entry->key, child); >>> + } else { >>> + qobject_incref(entry->value); >>> + qdict_put_obj(tmp2, entry->key, entry->value); >>> + } >>> + >>> + entry =3D next; >>> + } >>> + QDECREF(tmp1); >>> + >>> + /* Step 3: detect if we need to turn our dict into list */ >> >> You could use qdict_array_entries(tmp2, "") > 0 for this. >> >> If you do want to make { "0": 42, "2": 23 } and { "0": 42, "x": 23 } >> errors (my qdict_unflatten() just kept those QDicts), then you'd have = to >> check on qdict_array_entries() error whether the QDict contained an >> integer key, but that would still be simpler than the following loop a= nd >> the check in step 4. >=20 > If I use qdict_array_entries() then it does 2 O(N) iterations > of the input qdict, and to check the errors, I'll have todo > yet another iteration. My inlined code here does everything > in a single O(N) iteration instead of 3. I think that's > compelling enough to reason to not use that function. O(3N) =3D O(N) :-) Making qdict_array_entries() O(N) (pretending that O(N) and O(2N) were different things) for this case is trivial: Just omit the second loop if "" has been passed as the subqdict. So then you'd end up with "O(2N)", if you do the error checking. So you are right, there is a reason not to use qdict_array_entries(), but I'd personally still use it. I only have very limited knowledge of the whole qemu code base, but it doesn't appear to me as if QDict functions are ever used in a hot path or as if QDicts ever contain a large number of elements (with large being more than, say, 1000), especially not the kind of QDicts you'd want to crumple. Because of that, I chose to use these existing functions, even while knowing that they are probably not optimal for this case. You did propose removing qdict_array_entries() and qdict_array_split() in favor of just using this function in their stead (see my reply below), so if we do so, reusing them would obviously not be ideal any more. In that case, I'd still put this into its own static function if possible. tl;dr: I don't think runtime complexity is an issue here, but if we are going to drop qdict_array_entries() and qdict_array_split() anyway, then using them is a bad idea, obviously. It would then still be nice to put this into an own function. >>> + entry =3D qdict_first(tmp2); >>> + while (entry !=3D NULL) { >>> + next =3D qdict_next(tmp2, entry); >>> + >>> + errno =3D 0; >>> + if (qemu_strtoll(entry->key, NULL, 10, &val) =3D=3D 0) { >>> + if (!dst) { >>> + dst =3D (QObject *)qlist_new(); >>> + isList =3D true; >>> + } else if (!isList) { >>> + error_setg(errp, >>> + "Key '%s' is for a list, but previous key= is " >>> + "for a dict", entry->key); >>> + goto error; >>> + } >>> + listLen++; >>> + if (val > listMax) { >>> + listMax =3D val; >>> + } >>> + } else { >>> + if (!dst) { >>> + dst =3D (QObject *)tmp2; >>> + qobject_incref(dst); >>> + isList =3D false; >>> + } else if (isList) { >>> + error_setg(errp, >>> + "Key '%s' is for a dict, but previous key= is " >>> + "for a list", entry->key); >>> + goto error; >>> + } >>> + } >>> + >>> + entry =3D next; >>> + } >>> + >>> + /* Step 4: Turn the dict into a list */ >> >> Why not just use qdict_array_split(tmp2, &dst);? >=20 > Again qdict_array_split() is a pretty inefficiently written > function. It does multiple nested iterations over its input > giving it O(n^2) time, compared to O(n) for my code here. Strictly speaking, it's not O(n^2) but O(nm), where n is qdict_size(src) and m is the size of the resulting QList. (Side note: This function only is O(n) if qdict_get() is O(1), which it is in best case. In worst case, it's O(n), and then this code becomes O(n^2).) Again, I don't think that O(nm) is much worse than O(n) for the use case at hand, but if you are going to drop qdict_array_split() anyway, then, again, it doesn't make sense to call it at all. You could again put the code inside of the "if (isList) {}" block into an own function, but it's not that long so I'd be fine with it staying here (although it is certainly self-contained enough to look good in an own function :-)). > The only user of qdict_array_entries() and qdict_array_split() > is the block quorum driver. From a brief look at that code > I think it could probably be changed to just call this new > qdict_crumple() method, and those 2 inefficient functions > could be deleted, so we don't have duplication of code. I'm not sure. First, you do not want to crumple recursively there, so you'd have to re-flatten all the QList elements after you have crumpled them. Could be solved by adding a "bool recurse" parameter to this functi= on. Second, yes, qdict_array_entries() is used by quorum. However, it does (no longer) use qdict_array_split(); the users of that are util/qemu-config.c and blockdev.c. Replacing those with a non-recursing call to qdict_crumple() should be trivial, but quorum is going to be a bit more difficult. You could make it just call qdict_crumple(), get the size of the resulting list and then discard it, but that seems a bit over the top. All in all, I think you're right in that this function can replace the (potentially) worse qdict_array_split() function (if the caller can stop it from recursing), but I don't know whether we can get rid of qdict_array_entries() so easily. Max >>> + if (isList) { >>> + if (listLen !=3D (listMax + 1)) { >>> + error_setg(errp, "List indexes are not continuous, " >>> + "saw %zu elements but %zu largest index", >>> + listLen, listMax); >>> + goto error; >>> + } >>> + >>> + for (i =3D 0; i < listLen; i++) { >>> + char *key =3D g_strdup_printf("%zu", i); >>> + >>> + child =3D qdict_get(tmp2, key); >>> + g_free(key); >>> + if (!child) { >>> + error_setg(errp, "Unexpected missing list entry %zu"= , i); >>> + goto error; >>> + } >>> + >>> + qobject_incref(child); >>> + qlist_append_obj((QList *)dst, child); >>> + } >>> + } >>> + QDECREF(tmp2); >>> + >>> + return dst; >>> + >>> + error: >>> + QDECREF(tmp2); >>> + QDECREF(tmp1); >>> + qobject_decref(dst); >>> + return NULL; >>> +} >>> + >>> + >=20 > Regards, > Daniel >=20 --F9sabQ7TTRMvFmfBXvc3CLuPmOgkqNnCd-- --j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCAAGBQJW2vgvAAoJEDuxQgLoOKytF1IIAJnMUSXJN/ZfGYy3kv7SHLS+ rAOwrA19sFlkWyYNsmgLXS+pETYdqbVxajM/sPRdZoAf4gxIh3vbUtKeBGZTiWaL 5cZDnBNwrOw3YRvRGyafAJhtvsY5GRcFEzUH1zKgq9FTR5h0or3OZhuddwB++Xlb q+ks78klFQANieA+0VZUtGSwqbi2RT5KiC+x1S32N8nblYSPQdPLttun37HLW60t jv8+qsUzYw8ei0SUvySFylj860xnxmPcsWEHJmlXITthszibpJQVmnRK3VDnBzhd lfgb+coekkaVAaMI+AgAGXgSkhMKe9PqdH+Hsn5JginBh4GCzkqd+E+rHfwwiao= =lAsX -----END PGP SIGNATURE----- --j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj--