* [Qemu-devel] [PATCH 0/2] qdev: fail safely if can't allocate device in device_add() @ 2015-12-18 15:30 Igor Mammedov 2015-12-18 15:30 ` [Qemu-devel] [PATCH 1/2] qom: add object_class_get_instance_size() Igor Mammedov 2015-12-18 15:30 ` [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device Igor Mammedov 0 siblings, 2 replies; 9+ messages in thread From: Igor Mammedov @ 2015-12-18 15:30 UTC (permalink / raw) To: qemu-devel; +Cc: pbonzini, afaerber, armbru For hotplug it's important not to crash guest if hotplug is not possible. This series make 'device_add FOO,...' fail safely if there isn't enough memory for FOO device instance. Igor Mammedov (2): qom: add object_class_get_instance_size() qdev: safely fail device_add if unable to allocate device include/qom/object.h | 8 ++++++++ qdev-monitor.c | 9 ++++++++- qom/object.c | 5 +++++ 3 files changed, 21 insertions(+), 1 deletion(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH 1/2] qom: add object_class_get_instance_size() 2015-12-18 15:30 [Qemu-devel] [PATCH 0/2] qdev: fail safely if can't allocate device in device_add() Igor Mammedov @ 2015-12-18 15:30 ` Igor Mammedov 2016-01-11 15:37 ` Andreas Färber 2015-12-18 15:30 ` [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device Igor Mammedov 1 sibling, 1 reply; 9+ messages in thread From: Igor Mammedov @ 2015-12-18 15:30 UTC (permalink / raw) To: qemu-devel; +Cc: pbonzini, afaerber, armbru it will be used for allocating memory for a to be created new object in safe manner. Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- include/qom/object.h | 8 ++++++++ qom/object.c | 5 +++++ 2 files changed, 13 insertions(+) diff --git a/include/qom/object.h b/include/qom/object.h index 4509166..8f61b0b 100644 --- a/include/qom/object.h +++ b/include/qom/object.h @@ -880,6 +880,14 @@ bool object_class_is_abstract(ObjectClass *klass); */ ObjectClass *object_class_by_name(const char *typename); +/** + * object_class_get_instance_size: + * @klass: The #ObjectClass to obtain instance size + * + * Returns: instance size of @klass + */ +size_t object_class_get_instance_size(ObjectClass *klass); + void object_class_foreach(void (*fn)(ObjectClass *klass, void *opaque), const char *implements_type, bool include_abstract, void *opaque); diff --git a/qom/object.c b/qom/object.c index d751569..2f141ee 100644 --- a/qom/object.c +++ b/qom/object.c @@ -771,6 +771,11 @@ ObjectClass *object_class_get_parent(ObjectClass *class) return type->class; } +size_t object_class_get_instance_size(ObjectClass *klass) +{ + return klass->type->instance_size; +} + typedef struct OCFData { void (*fn)(ObjectClass *klass, void *opaque); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH 1/2] qom: add object_class_get_instance_size() 2015-12-18 15:30 ` [Qemu-devel] [PATCH 1/2] qom: add object_class_get_instance_size() Igor Mammedov @ 2016-01-11 15:37 ` Andreas Färber 0 siblings, 0 replies; 9+ messages in thread From: Andreas Färber @ 2016-01-11 15:37 UTC (permalink / raw) To: Igor Mammedov, qemu-devel; +Cc: pbonzini, armbru Am 18.12.2015 um 16:30 schrieb Igor Mammedov: > it will be used for allocating memory for a to be > created new object in safe manner. > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > include/qom/object.h | 8 ++++++++ > qom/object.c | 5 +++++ > 2 files changed, 13 insertions(+) > > diff --git a/include/qom/object.h b/include/qom/object.h > index 4509166..8f61b0b 100644 > --- a/include/qom/object.h > +++ b/include/qom/object.h > @@ -880,6 +880,14 @@ bool object_class_is_abstract(ObjectClass *klass); > */ > ObjectClass *object_class_by_name(const char *typename); > > +/** > + * object_class_get_instance_size: > + * @klass: The #ObjectClass to obtain instance size > + * > + * Returns: instance size of @klass > + */ > +size_t object_class_get_instance_size(ObjectClass *klass); > + > void object_class_foreach(void (*fn)(ObjectClass *klass, void *opaque), > const char *implements_type, bool include_abstract, > void *opaque); > diff --git a/qom/object.c b/qom/object.c > index d751569..2f141ee 100644 > --- a/qom/object.c > +++ b/qom/object.c > @@ -771,6 +771,11 @@ ObjectClass *object_class_get_parent(ObjectClass *class) > return type->class; > } > > +size_t object_class_get_instance_size(ObjectClass *klass) > +{ > + return klass->type->instance_size; > +} > + > typedef struct OCFData > { > void (*fn)(ObjectClass *klass, void *opaque); I had previously proposed a type_get_instance_size() but did not get any feedback: http://patchwork.ozlabs.org/patch/269591/ It was using name -> type, here you are using class -> type. One does not necessarily exclude the other. :) Regards, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device 2015-12-18 15:30 [Qemu-devel] [PATCH 0/2] qdev: fail safely if can't allocate device in device_add() Igor Mammedov 2015-12-18 15:30 ` [Qemu-devel] [PATCH 1/2] qom: add object_class_get_instance_size() Igor Mammedov @ 2015-12-18 15:30 ` Igor Mammedov 2015-12-18 16:48 ` Daniel P. Berrange 1 sibling, 1 reply; 9+ messages in thread From: Igor Mammedov @ 2015-12-18 15:30 UTC (permalink / raw) To: qemu-devel; +Cc: pbonzini, afaerber, armbru qdev_device_add() currently uses object_new() which will abort if there memory allocation for device instance fails. While it's fine it startup, it is not desirable diring hotplug. Try to allocate memory for object first and fail safely if allocation fails. Signed-off-by: Igor Mammedov <imammedo@redhat.com> --- It's just a step in making hotplug safer wrt object allocation. To make it more safer, hotplugged class constructor shouldn't allocate memory either, but that should be addressed on per device basis providing we fix QOM internals to avoid dynamic allocations. --- qdev-monitor.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/qdev-monitor.c b/qdev-monitor.c index a35098f..a70262e 100644 --- a/qdev-monitor.c +++ b/qdev-monitor.c @@ -514,6 +514,7 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp) DeviceClass *dc; const char *driver, *path, *id; DeviceState *dev; + size_t obj_size; BusState *bus = NULL; Error *err = NULL; @@ -555,7 +556,13 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp) } /* create device */ - dev = DEVICE(object_new(driver)); + obj_size = object_class_get_instance_size(OBJECT_CLASS(dc)); + dev = g_try_malloc0(obj_size); + if (dev == NULL) { + error_setg(errp, "Not enough memory for Device '%s'", driver); + return NULL; + } + object_initialize(dev, obj_size, driver); if (bus) { qdev_set_parent_bus(dev, bus); -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device 2015-12-18 15:30 ` [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device Igor Mammedov @ 2015-12-18 16:48 ` Daniel P. Berrange 2015-12-18 17:26 ` Eric Blake 0 siblings, 1 reply; 9+ messages in thread From: Daniel P. Berrange @ 2015-12-18 16:48 UTC (permalink / raw) To: Igor Mammedov; +Cc: pbonzini, qemu-devel, armbru, afaerber On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote: > qdev_device_add() currently uses object_new() which > will abort if there memory allocation for device instance > fails. While it's fine it startup, it is not desirable > diring hotplug. > > Try to allocate memory for object first and fail safely > if allocation fails. > > Signed-off-by: Igor Mammedov <imammedo@redhat.com> > --- > It's just a step in making hotplug safer wrt object allocation. > To make it more safer, hotplugged class constructor > shouldn't allocate memory either, but that should be > addressed on per device basis providing we fix QOM > internals to avoid dynamic allocations. > --- > qdev-monitor.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/qdev-monitor.c b/qdev-monitor.c > index a35098f..a70262e 100644 > --- a/qdev-monitor.c > +++ b/qdev-monitor.c > @@ -514,6 +514,7 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp) > DeviceClass *dc; > const char *driver, *path, *id; > DeviceState *dev; > + size_t obj_size; > BusState *bus = NULL; > Error *err = NULL; > > @@ -555,7 +556,13 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp) > } > > /* create device */ > - dev = DEVICE(object_new(driver)); > + obj_size = object_class_get_instance_size(OBJECT_CLASS(dc)); > + dev = g_try_malloc0(obj_size); > + if (dev == NULL) { > + error_setg(errp, "Not enough memory for Device '%s'", driver); > + return NULL; > + } This just avoids one small malloc failure. > + object_initialize(dev, obj_size, driver); This is going to call g_new many more times, so you'll still hit OOM almost immediately. eg the call to g_hash_table_new_full() in object_initialize_with_type will abort on OOM, not to mention anything run in a instance constructor function registered against the class. There's no way to avoid this given that we have chosen to use GLib in QEMU, so I don't really see any point in replacing the 'object_new' call with g_try_malloc Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device 2015-12-18 16:48 ` Daniel P. Berrange @ 2015-12-18 17:26 ` Eric Blake 2015-12-18 21:15 ` Markus Armbruster 0 siblings, 1 reply; 9+ messages in thread From: Eric Blake @ 2015-12-18 17:26 UTC (permalink / raw) To: Daniel P. Berrange, Igor Mammedov; +Cc: pbonzini, qemu-devel, afaerber, armbru [-- Attachment #1: Type: text/plain, Size: 1988 bytes --] On 12/18/2015 09:48 AM, Daniel P. Berrange wrote: > On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote: >> qdev_device_add() currently uses object_new() which >> will abort if there memory allocation for device instance >> fails. While it's fine it startup, it is not desirable >> diring hotplug. >> >> Try to allocate memory for object first and fail safely >> if allocation fails. >> > This just avoids one small malloc failure. > >> + object_initialize(dev, obj_size, driver); > > This is going to call g_new many more times, so you'll > still hit OOM almost immediately. eg the call to > g_hash_table_new_full() in object_initialize_with_type > will abort on OOM, not to mention anything run in a > instance constructor function registered against the > class. There's no way to avoid this given that we have > chosen to use GLib in QEMU, so I don't really see any > point in replacing the 'object_new' call with g_try_malloc We just had a recent thread on it, and I tend to agree that this series isn't helpful. https://lists.gnu.org/archive/html/qemu-devel/2015-12/threads.html#01238 If the allocation is BIG (as in one megabyte or more), then doing tentative allocation and reporting failure is useful (we still have enough small memory left to meaningfully report the error). But here, objects are unlikely to require a megabyte each (and if you really do have an obj_size that large, given that it is a sizeof(type), we may have other problems on our hands - what does the .h look like that defines a type that big, and how slow does it compile?). We can assume that small allocations will always succeed, because if they fail, even the mere act of trying to gracefully recover and report that failure is likely to malloc more memory, which will recursively fail and hose us worse than exiting right away. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device 2015-12-18 17:26 ` Eric Blake @ 2015-12-18 21:15 ` Markus Armbruster 2016-01-11 16:04 ` Andreas Färber 0 siblings, 1 reply; 9+ messages in thread From: Markus Armbruster @ 2015-12-18 21:15 UTC (permalink / raw) To: Eric Blake; +Cc: Igor Mammedov, qemu-devel, afaerber, pbonzini Eric Blake <eblake@redhat.com> writes: > On 12/18/2015 09:48 AM, Daniel P. Berrange wrote: >> On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote: >>> qdev_device_add() currently uses object_new() which >>> will abort if there memory allocation for device instance >>> fails. While it's fine it startup, it is not desirable >>> diring hotplug. >>> >>> Try to allocate memory for object first and fail safely >>> if allocation fails. >>> > >> This just avoids one small malloc failure. >> >>> + object_initialize(dev, obj_size, driver); >> >> This is going to call g_new many more times, so you'll >> still hit OOM almost immediately. eg the call to >> g_hash_table_new_full() in object_initialize_with_type >> will abort on OOM, not to mention anything run in a >> instance constructor function registered against the >> class. There's no way to avoid this given that we have >> chosen to use GLib in QEMU, so I don't really see any >> point in replacing the 'object_new' call with g_try_malloc Seconded. > We just had a recent thread on it, and I tend to agree that this series > isn't helpful. > > https://lists.gnu.org/archive/html/qemu-devel/2015-12/threads.html#01238 Plenty of arguments there why recovering from scattered random allocation failures isn't useful, and recovering from all of them isn't practical. Limit recovery attempts to big allocations, and spend (some of) the saved cycles on actually testing the recovery code. [...] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device 2015-12-18 21:15 ` Markus Armbruster @ 2016-01-11 16:04 ` Andreas Färber 2016-01-12 15:43 ` Daniel P. Berrange 0 siblings, 1 reply; 9+ messages in thread From: Andreas Färber @ 2016-01-11 16:04 UTC (permalink / raw) To: Markus Armbruster, Eric Blake, Igor Mammedov; +Cc: pbonzini, qemu-devel Am 18.12.2015 um 22:15 schrieb Markus Armbruster: > Eric Blake <eblake@redhat.com> writes: >> On 12/18/2015 09:48 AM, Daniel P. Berrange wrote: >>> On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote: >>>> qdev_device_add() currently uses object_new() which >>>> will abort if there memory allocation for device instance >>>> fails. While it's fine it startup, it is not desirable >>>> diring hotplug. >>>> >>>> Try to allocate memory for object first and fail safely >>>> if allocation fails. >> >>> This just avoids one small malloc failure. >>> >>>> + object_initialize(dev, obj_size, driver); >>> >>> This is going to call g_new many more times, so you'll >>> still hit OOM almost immediately. eg the call to >>> g_hash_table_new_full() in object_initialize_with_type >>> will abort on OOM, not to mention anything run in a >>> instance constructor function registered against the >>> class. There's no way to avoid this given that we have >>> chosen to use GLib in QEMU, so I don't really see any >>> point in replacing the 'object_new' call with g_try_malloc > > Seconded. > >> We just had a recent thread on it, and I tend to agree that this series >> isn't helpful. >> >> https://lists.gnu.org/archive/html/qemu-devel/2015-12/threads.html#01238 > > Plenty of arguments there why recovering from scattered random > allocation failures isn't useful, and recovering from all of them isn't > practical. Limit recovery attempts to big allocations, and spend (some > of) the saved cycles on actually testing the recovery code. Jumping in late... I had done work into this direction around 2012/2013 and I believe that changing the hotplug allocation is the right thing to do here. http://patchwork.ozlabs.org/patch/269595/ Igor's error handling could still use some improvement (at the time we did not have the out argument) and my suggested solution was a similar one to error_abort (pre-allocation and special handling), to avoid allocation of Error objects and strings on OOM. Daniel, any allocations other than the core QOM API inside an instance_init function are plain wrong, has been pointed out to patch authors and is not an argument for not handling hot-plug errors/design properly. My huge QOM conversions always had in mind that instance_init cannot do random allocations and moved them to realize, which may fail, including for OOM, and should be handled gracefully for hot-plug. For startup I don't see it as critical - it may be inconvenient not to get a nice error message but you don't risk losing the guest's data. Now to the claim that this is just a small allocation. If some 20-char string allocation fails, there's little QEMU can realistically do recovery-wise. But a PowerPCCPU device for instance comes with huge multi-level instruction-dispatch tables even in KVM mode. Therefore my opinion that this user-initiated scenario and thus code location is exactly the one we need to handle, even if many others remain unhandled. It's also why CPU hot-plug needs to take QOM design into account, and proposals parametrizing core count etc. make size precalculation tricky. Regards, Andreas -- SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device 2016-01-11 16:04 ` Andreas Färber @ 2016-01-12 15:43 ` Daniel P. Berrange 0 siblings, 0 replies; 9+ messages in thread From: Daniel P. Berrange @ 2016-01-12 15:43 UTC (permalink / raw) To: Andreas Färber Cc: qemu-devel, Igor Mammedov, Markus Armbruster, pbonzini On Mon, Jan 11, 2016 at 05:04:01PM +0100, Andreas Färber wrote: > Am 18.12.2015 um 22:15 schrieb Markus Armbruster: > > Eric Blake <eblake@redhat.com> writes: > >> On 12/18/2015 09:48 AM, Daniel P. Berrange wrote: > >>> On Fri, Dec 18, 2015 at 04:30:47PM +0100, Igor Mammedov wrote: > >>>> qdev_device_add() currently uses object_new() which > >>>> will abort if there memory allocation for device instance > >>>> fails. While it's fine it startup, it is not desirable > >>>> diring hotplug. > >>>> > >>>> Try to allocate memory for object first and fail safely > >>>> if allocation fails. > >> > >>> This just avoids one small malloc failure. > >>> > >>>> + object_initialize(dev, obj_size, driver); > >>> > >>> This is going to call g_new many more times, so you'll > >>> still hit OOM almost immediately. eg the call to > >>> g_hash_table_new_full() in object_initialize_with_type > >>> will abort on OOM, not to mention anything run in a > >>> instance constructor function registered against the > >>> class. There's no way to avoid this given that we have > >>> chosen to use GLib in QEMU, so I don't really see any > >>> point in replacing the 'object_new' call with g_try_malloc > > > > Seconded. > > > >> We just had a recent thread on it, and I tend to agree that this series > >> isn't helpful. > >> > >> https://lists.gnu.org/archive/html/qemu-devel/2015-12/threads.html#01238 > > > > Plenty of arguments there why recovering from scattered random > > allocation failures isn't useful, and recovering from all of them isn't > > practical. Limit recovery attempts to big allocations, and spend (some > > of) the saved cycles on actually testing the recovery code. > > Jumping in late... I had done work into this direction around 2012/2013 > and I believe that changing the hotplug allocation is the right thing to > do here. > > http://patchwork.ozlabs.org/patch/269595/ > > Igor's error handling could still use some improvement (at the time we > did not have the out argument) and my suggested solution was a similar > one to error_abort (pre-allocation and special handling), to avoid > allocation of Error objects and strings on OOM. > > Daniel, any allocations other than the core QOM API inside an > instance_init function are plain wrong, has been pointed out to patch > authors and is not an argument for not handling hot-plug errors/design > properly. My huge QOM conversions always had in mind that instance_init > cannot do random allocations and moved them to realize, which may fail, > including for OOM, and should be handled gracefully for hot-plug. For > startup I don't see it as critical - it may be inconvenient not to get a > nice error message but you don't risk losing the guest's data. > > Now to the claim that this is just a small allocation. If some 20-char > string allocation fails, there's little QEMU can realistically do > recovery-wise. But a PowerPCCPU device for instance comes with huge > multi-level instruction-dispatch tables even in KVM mode. Therefore my > opinion that this user-initiated scenario and thus code location is > exactly the one we need to handle, even if many others remain unhandled. > It's also why CPU hot-plug needs to take QOM design into account, and > proposals parametrizing core count etc. make size precalculation tricky. I'm still don't think that trying to recover from OOM on device hotplug is going to be usable/useful in practice. Thinking only about QEMU, and not the rest of the OS, we already have many other threads running in parallel which are potentially allocating memory too. If we're close enough to the limit that a device hotplug triggers OOM, we have to consider that another thread may trigger OOM concurrently. Even if the device hotplug allocation is moderately large, we may succeeed in allocating the large chunk, only to pushed so close to the limit, that any other small allocation just after then pushes us into OOM. So even coping with this one failure is not going to make the hotplug code robust in general - it is just giving a false sense of reliability IMHO. If people are running QEMU on hosts that are so heavily overcommit on RAM that a device hotplug pushes QEMU into OOM, they're pretty much doomed regardless of whether we try to handle te one allocation failure shown in this patch. I just don't see anything useful coming from trying to handle allocation failure here, given that we've decided to accept abort-on-OOM as the general policy throughout QEMU. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-01-12 15:43 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-12-18 15:30 [Qemu-devel] [PATCH 0/2] qdev: fail safely if can't allocate device in device_add() Igor Mammedov 2015-12-18 15:30 ` [Qemu-devel] [PATCH 1/2] qom: add object_class_get_instance_size() Igor Mammedov 2016-01-11 15:37 ` Andreas Färber 2015-12-18 15:30 ` [Qemu-devel] [PATCH 2/2] qdev: safely fail device_add if unable to allocate device Igor Mammedov 2015-12-18 16:48 ` Daniel P. Berrange 2015-12-18 17:26 ` Eric Blake 2015-12-18 21:15 ` Markus Armbruster 2016-01-11 16:04 ` Andreas Färber 2016-01-12 15:43 ` Daniel P. Berrange
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).