From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:58856)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1aIex2-0000hq-0J
	for qemu-devel@nongnu.org; Mon, 11 Jan 2016 11:04:16 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1aIewx-0006ch-1T
	for qemu-devel@nongnu.org; Mon, 11 Jan 2016 11:04:11 -0500
Received: from mx2.suse.de ([195.135.220.15]:53999)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1aIeww-0006cM-C1
	for qemu-devel@nongnu.org; Mon, 11 Jan 2016 11:04:06 -0500
References: <1450452647-118105-1-git-send-email-imammedo@redhat.com>
	<1450452647-118105-3-git-send-email-imammedo@redhat.com>
	<20151218164823.GH7228@redhat.com> <567441AA.1010309@redhat.com>
	<87wpsbwj9k.fsf@blackfin.pond.sub.org>
From: =?UTF-8?Q?Andreas_F=c3=a4rber?= <afaerber@suse.de>
Message-ID: <5693D271.3090608@suse.de>
Date: Mon, 11 Jan 2016 17:04:01 +0100
MIME-Version: 1.0
In-Reply-To: <87wpsbwj9k.fsf@blackfin.pond.sub.org>
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable
 to allocate device
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Markus Armbruster <armbru@redhat.com>, Eric Blake <eblake@redhat.com>, Igor Mammedov <imammedo@redhat.com>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org

Am 18.12.2015 um 22:15 schrieb Markus Armbruster:
> Eric Blake <eblake@redhat.com> writes:
>> On 12/18/2015 09:48 AM, Daniel P. Berrange wrote:
>>> On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote:
>>>> qdev_device_add() currently uses object_new() which
>>>> will abort if there memory allocation for device instance
>>>> fails. While it's fine it startup, it is not desirable
>>>> diring hotplug.
>>>>
>>>> Try to allocate memory for object first and fail safely
>>>> if allocation fails.
>>
>>> This just avoids one small malloc failure.
>>>
>>>> +    object_initialize(dev, obj_size, driver);
>>>
>>> This is going to call g_new many more times, so you'll
>>> still hit OOM almost immediately. eg the call to
>>> g_hash_table_new_full() in object_initialize_with_type
>>> will abort on OOM, not to mention anything run in a
>>> instance constructor function registered against the
>>> class. There's no way to avoid this given that we have
>>> chosen to use GLib in QEMU, so I don't really see any
>>> point in replacing the 'object_new' call with g_try_malloc
>=20
> Seconded.
>=20
>> We just had a recent thread on it, and I tend to agree that this serie=
s
>> isn't helpful.
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2015-12/threads.html#012=
38
>=20
> Plenty of arguments there why recovering from scattered random
> allocation failures isn't useful, and recovering from all of them isn't
> practical.  Limit recovery attempts to big allocations, and spend (some
> of) the saved cycles on actually testing the recovery code.

Jumping in late... I had done work into this direction around 2012/2013
and I believe that changing the hotplug allocation is the right thing to
do here.

http://patchwork.ozlabs.org/patch/269595/

Igor's error handling could still use some improvement (at the time we
did not have the out argument) and my suggested solution was a similar
one to error_abort (pre-allocation and special handling), to avoid
allocation of Error objects and strings on OOM.

Daniel, any allocations other than the core QOM API inside an
instance_init function are plain wrong, has been pointed out to patch
authors and is not an argument for not handling hot-plug errors/design
properly. My huge QOM conversions always had in mind that instance_init
cannot do random allocations and moved them to realize, which may fail,
including for OOM, and should be handled gracefully for hot-plug. For
startup I don't see it as critical - it may be inconvenient not to get a
nice error message but you don't risk losing the guest's data.

Now to the claim that this is just a small allocation. If some 20-char
string allocation fails, there's little QEMU can realistically do
recovery-wise. But a PowerPCCPU device for instance comes with huge
multi-level instruction-dispatch tables even in KVM mode. Therefore my
opinion that this user-initiated scenario and thus code location is
exactly the one we need to handle, even if many others remain unhandled.
It's also why CPU hot-plug needs to take QOM design into account, and
proposals parametrizing core count etc. make size precalculation tricky.

Regards,
Andreas

--=20
SUSE Linux GmbH, Maxfeldstr. 5, 90409 N=FCrnberg, Germany
GF: Felix Imend=F6rffer, Jane Smithard, Graham Norton; HRB 21284 (AG N=FC=
rnberg)