From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41426)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mreitz@redhat.com>) id 1acDw9-00045K-8v
	for qemu-devel@nongnu.org; Sat, 05 Mar 2016 10:16:10 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <mreitz@redhat.com>) id 1acDw6-0005ul-04
	for qemu-devel@nongnu.org; Sat, 05 Mar 2016 10:16:09 -0500
Received: from mx1.redhat.com ([209.132.183.28]:47358)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mreitz@redhat.com>) id 1acDw5-0005uU-Nw
	for qemu-devel@nongnu.org; Sat, 05 Mar 2016 10:16:05 -0500
References: <1455900463-16007-1-git-send-email-berrange@redhat.com>
	<1455900463-16007-2-git-send-email-berrange@redhat.com>
	<56D71147.4050204@redhat.com> <20160303110135.GB32270@redhat.com>
From: Max Reitz <mreitz@redhat.com>
Message-ID: <56DAF82F.8060401@redhat.com>
Date: Sat, 5 Mar 2016 16:15:59 +0100
MIME-Version: 1.0
In-Reply-To: <20160303110135.GB32270@redhat.com>
Content-Type: multipart/signed; micalg=pgp-sha256;
	protocol="application/pgp-signature";
	boundary="j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj"
Subject: Re: [Qemu-devel] [PATCH v1 01/10] qdict: implement a qdict_crumple
 method for un-flattening a dict
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org, =?UTF-8?Q?Andreas_F=c3=a4rber?= <afaerber@suse.de>, Markus Armbruster <armbru@redhat.com>

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj
Content-Type: multipart/mixed; boundary="F9sabQ7TTRMvFmfBXvc3CLuPmOgkqNnCd"
From: Max Reitz <mreitz@redhat.com>
To: "Daniel P. Berrange" <berrange@redhat.com>
Cc: qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
 Markus Armbruster <armbru@redhat.com>, =?UTF-8?Q?Andreas_F=c3=a4rber?=
 <afaerber@suse.de>
Message-ID: <56DAF82F.8060401@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v1 01/10] qdict: implement a qdict_crumple
 method for un-flattening a dict
References: <1455900463-16007-1-git-send-email-berrange@redhat.com>
 <1455900463-16007-2-git-send-email-berrange@redhat.com>
 <56D71147.4050204@redhat.com> <20160303110135.GB32270@redhat.com>
In-Reply-To: <20160303110135.GB32270@redhat.com>

--F9sabQ7TTRMvFmfBXvc3CLuPmOgkqNnCd
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 03.03.2016 12:01, Daniel P. Berrange wrote:
> On Wed, Mar 02, 2016 at 05:13:59PM +0100, Max Reitz wrote:
>> On 19.02.2016 17:47, Daniel P. Berrange wrote:
>>> The qdict_flatten() method will take a dict whose elements are
>>> further nested dicts/lists and flatten them by concatenating
>>> keys.
>>>
>>> The qdict_crumple() method aims todo the reverse, taking a flat
>>> qdict, and turning it into a set of nested dicts/lists. It will
>>> apply nesting based on the key name, with a '.' indicating a
>>> new level in the hierarchy. If the keys in the nested structure
>>> are all numeric, it will create a list, otherwise it will create
>>> a dict.
>>>
>>> If the keys are a mixture of numeric and non-numeric, or the
>>> numeric keys are not in strictly ascending order, an error will
>>> be reported.

[...]

>>> +            if (!child) {
>>> +                goto error;
>>> +            }
>>> +
>>> +            qdict_put_obj(tmp2, entry->key, child);
>>> +        } else {
>>> +            qobject_incref(entry->value);
>>> +            qdict_put_obj(tmp2, entry->key, entry->value);
>>> +        }
>>> +
>>> +        entry =3D next;
>>> +    }
>>> +    QDECREF(tmp1);
>>> +
>>> +    /* Step 3: detect if we need to turn our dict into list */
>>
>> You could use qdict_array_entries(tmp2, "") > 0 for this.
>>
>> If you do want to make { "0": 42, "2": 23 } and { "0": 42, "x": 23 }
>> errors (my qdict_unflatten() just kept those QDicts), then you'd have =
to
>> check on qdict_array_entries() error whether the QDict contained an
>> integer key, but that would still be simpler than the following loop a=
nd
>> the check in step 4.
>=20
> If I use qdict_array_entries() then it does 2 O(N) iterations
> of the input qdict, and to check the errors, I'll have todo
> yet another iteration.  My inlined code here does everything
> in a single O(N) iteration instead of 3. I think that's
> compelling enough to reason to not use that function.

O(3N) =3D O(N) :-)

Making qdict_array_entries() O(N) (pretending that O(N) and O(2N) were
different things) for this case is trivial: Just omit the second loop if
"" has been passed as the subqdict.

So then you'd end up with "O(2N)", if you do the error checking.

So you are right, there is a reason not to use qdict_array_entries(),
but I'd personally still use it. I only have very limited knowledge of
the whole qemu code base, but it doesn't appear to me as if QDict
functions are ever used in a hot path or as if QDicts ever contain a
large number of elements (with large being more than, say, 1000),
especially not the kind of QDicts you'd want to crumple.

Because of that, I chose to use these existing functions, even while
knowing that they are probably not optimal for this case.

You did propose removing qdict_array_entries() and qdict_array_split()
in favor of just using this function in their stead (see my reply
below), so if we do so, reusing them would obviously not be ideal any
more. In that case, I'd still put this into its own static function if
possible.

tl;dr: I don't think runtime complexity is an issue here, but if we are
going to drop qdict_array_entries() and qdict_array_split() anyway, then
using them is a bad idea, obviously. It would then still be nice to put
this into an own function.

>>> +    entry =3D qdict_first(tmp2);
>>> +    while (entry !=3D NULL) {
>>> +        next =3D qdict_next(tmp2, entry);
>>> +
>>> +        errno =3D 0;
>>> +        if (qemu_strtoll(entry->key, NULL, 10, &val) =3D=3D 0) {
>>> +            if (!dst) {
>>> +                dst =3D (QObject *)qlist_new();
>>> +                isList =3D true;
>>> +            } else if (!isList) {
>>> +                error_setg(errp,
>>> +                           "Key '%s' is for a list, but previous key=
 is "
>>> +                           "for a dict", entry->key);
>>> +                goto error;
>>> +            }
>>> +            listLen++;
>>> +            if (val > listMax) {
>>> +                listMax =3D val;
>>> +            }
>>> +        } else {
>>> +            if (!dst) {
>>> +                dst =3D (QObject *)tmp2;
>>> +                qobject_incref(dst);
>>> +                isList =3D false;
>>> +            } else if (isList) {
>>> +                error_setg(errp,
>>> +                           "Key '%s' is for a dict, but previous key=
 is "
>>> +                           "for a list", entry->key);
>>> +                goto error;
>>> +            }
>>> +        }
>>> +
>>> +        entry =3D next;
>>> +    }
>>> +
>>> +    /* Step 4: Turn the dict into a list */
>>
>> Why not just use qdict_array_split(tmp2, &dst);?
>=20
> Again qdict_array_split() is a pretty inefficiently written
> function. It does multiple nested iterations over its input
> giving it O(n^2) time, compared to O(n) for my code here.

Strictly speaking, it's not O(n^2) but O(nm), where n is qdict_size(src)
and m is the size of the resulting QList.

(Side note: This function only is O(n) if qdict_get() is O(1), which it
is in best case. In worst case, it's O(n), and then this code becomes
O(n^2).)

Again, I don't think that O(nm) is much worse than O(n) for the use case
at hand, but if you are going to drop qdict_array_split() anyway, then,
again, it doesn't make sense to call it at all.

You could again put the code inside of the "if (isList) {}" block into
an own function, but it's not that long so I'd be fine with it staying
here (although it is certainly self-contained enough to look good in an
own function :-)).

> The only user of qdict_array_entries() and qdict_array_split()
> is the block quorum driver. From a brief look at that code
> I think it could probably be changed to just call this new
> qdict_crumple() method, and those 2 inefficient functions
> could be deleted, so we don't have duplication of code.

I'm not sure. First, you do not want to crumple recursively there, so
you'd have to re-flatten all the QList elements after you have crumpled
them. Could be solved by adding a "bool recurse" parameter to this functi=
on.

Second, yes, qdict_array_entries() is used by quorum. However, it does
(no longer) use qdict_array_split(); the users of that are
util/qemu-config.c and blockdev.c. Replacing those with a non-recursing
call to qdict_crumple() should be trivial, but quorum is going to be a
bit more difficult. You could make it just call qdict_crumple(), get the
size of the resulting list and then discard it, but that seems a bit
over the top.

All in all, I think you're right in that this function can replace the
(potentially) worse qdict_array_split() function (if the caller can stop
it from recursing), but I don't know whether we can get rid of
qdict_array_entries() so easily.

Max

>>> +    if (isList) {
>>> +        if (listLen !=3D (listMax + 1)) {
>>> +            error_setg(errp, "List indexes are not continuous, "
>>> +                       "saw %zu elements but %zu largest index",
>>> +                       listLen, listMax);
>>> +            goto error;
>>> +        }
>>> +
>>> +        for (i =3D 0; i < listLen; i++) {
>>> +            char *key =3D g_strdup_printf("%zu", i);
>>> +
>>> +            child =3D qdict_get(tmp2, key);
>>> +            g_free(key);
>>> +            if (!child) {
>>> +                error_setg(errp, "Unexpected missing list entry %zu"=
, i);
>>> +                goto error;
>>> +            }
>>> +
>>> +            qobject_incref(child);
>>> +            qlist_append_obj((QList *)dst, child);
>>> +        }
>>> +    }
>>> +    QDECREF(tmp2);
>>> +
>>> +    return dst;
>>> +
>>> + error:
>>> +    QDECREF(tmp2);
>>> +    QDECREF(tmp1);
>>> +    qobject_decref(dst);
>>> +    return NULL;
>>> +}
>>> +
>>> +
>=20
> Regards,
> Daniel
>=20


--F9sabQ7TTRMvFmfBXvc3CLuPmOgkqNnCd--

--j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJW2vgvAAoJEDuxQgLoOKytF1IIAJnMUSXJN/ZfGYy3kv7SHLS+
rAOwrA19sFlkWyYNsmgLXS+pETYdqbVxajM/sPRdZoAf4gxIh3vbUtKeBGZTiWaL
5cZDnBNwrOw3YRvRGyafAJhtvsY5GRcFEzUH1zKgq9FTR5h0or3OZhuddwB++Xlb
q+ks78klFQANieA+0VZUtGSwqbi2RT5KiC+x1S32N8nblYSPQdPLttun37HLW60t
jv8+qsUzYw8ei0SUvySFylj860xnxmPcsWEHJmlXITthszibpJQVmnRK3VDnBzhd
lfgb+coekkaVAaMI+AgAGXgSkhMKe9PqdH+Hsn5JginBh4GCzkqd+E+rHfwwiao=
=lAsX
-----END PGP SIGNATURE-----

--j9WISgksMHdhCi5SfgCNfNm7f8xO5Dhkj--