From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58856) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aIex2-0000hq-0J for qemu-devel@nongnu.org; Mon, 11 Jan 2016 11:04:16 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aIewx-0006ch-1T for qemu-devel@nongnu.org; Mon, 11 Jan 2016 11:04:11 -0500 Received: from mx2.suse.de ([195.135.220.15]:53999) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aIeww-0006cM-C1 for qemu-devel@nongnu.org; Mon, 11 Jan 2016 11:04:06 -0500 References: <1450452647-118105-1-git-send-email-imammedo@redhat.com> <1450452647-118105-3-git-send-email-imammedo@redhat.com> <20151218164823.GH7228@redhat.com> <567441AA.1010309@redhat.com> <87wpsbwj9k.fsf@blackfin.pond.sub.org> From: =?UTF-8?Q?Andreas_F=c3=a4rber?= Message-ID: <5693D271.3090608@suse.de> Date: Mon, 11 Jan 2016 17:04:01 +0100 MIME-Version: 1.0 In-Reply-To: <87wpsbwj9k.fsf@blackfin.pond.sub.org> Content-Type: text/plain; charset=iso-8859-15 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster , Eric Blake , Igor Mammedov Cc: pbonzini@redhat.com, qemu-devel@nongnu.org Am 18.12.2015 um 22:15 schrieb Markus Armbruster: > Eric Blake writes: >> On 12/18/2015 09:48 AM, Daniel P. Berrange wrote: >>> On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote: >>>> qdev_device_add() currently uses object_new() which >>>> will abort if there memory allocation for device instance >>>> fails. While it's fine it startup, it is not desirable >>>> diring hotplug. >>>> >>>> Try to allocate memory for object first and fail safely >>>> if allocation fails. >> >>> This just avoids one small malloc failure. >>> >>>> + object_initialize(dev, obj_size, driver); >>> >>> This is going to call g_new many more times, so you'll >>> still hit OOM almost immediately. eg the call to >>> g_hash_table_new_full() in object_initialize_with_type >>> will abort on OOM, not to mention anything run in a >>> instance constructor function registered against the >>> class. There's no way to avoid this given that we have >>> chosen to use GLib in QEMU, so I don't really see any >>> point in replacing the 'object_new' call with g_try_malloc >=20 > Seconded. >=20 >> We just had a recent thread on it, and I tend to agree that this serie= s >> isn't helpful. >> >> https://lists.gnu.org/archive/html/qemu-devel/2015-12/threads.html#012= 38 >=20 > Plenty of arguments there why recovering from scattered random > allocation failures isn't useful, and recovering from all of them isn't > practical. Limit recovery attempts to big allocations, and spend (some > of) the saved cycles on actually testing the recovery code. Jumping in late... I had done work into this direction around 2012/2013 and I believe that changing the hotplug allocation is the right thing to do here. http://patchwork.ozlabs.org/patch/269595/ Igor's error handling could still use some improvement (at the time we did not have the out argument) and my suggested solution was a similar one to error_abort (pre-allocation and special handling), to avoid allocation of Error objects and strings on OOM. Daniel, any allocations other than the core QOM API inside an instance_init function are plain wrong, has been pointed out to patch authors and is not an argument for not handling hot-plug errors/design properly. My huge QOM conversions always had in mind that instance_init cannot do random allocations and moved them to realize, which may fail, including for OOM, and should be handled gracefully for hot-plug. For startup I don't see it as critical - it may be inconvenient not to get a nice error message but you don't risk losing the guest's data. Now to the claim that this is just a small allocation. If some 20-char string allocation fails, there's little QEMU can realistically do recovery-wise. But a PowerPCCPU device for instance comes with huge multi-level instruction-dispatch tables even in KVM mode. Therefore my opinion that this user-initiated scenario and thus code location is exactly the one we need to handle, even if many others remain unhandled. It's also why CPU hot-plug needs to take QOM design into account, and proposals parametrizing core count etc. make size precalculation tricky. Regards, Andreas --=20 SUSE Linux GmbH, Maxfeldstr. 5, 90409 N=FCrnberg, Germany GF: Felix Imend=F6rffer, Jane Smithard, Graham Norton; HRB 21284 (AG N=FC= rnberg)