* [Qemu-devel] Error handling in realize() methods
@ 2015-12-08 13:47 Markus Armbruster
2015-12-08 14:19 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 19+ messages in thread
From: Markus Armbruster @ 2015-12-08 13:47 UTC (permalink / raw)
To: qemu-devel
Cc: Paolo Bonzini, Peter Crosthwaite, Andreas Färber,
Peter Maydell
In general, code running within a realize() method should not exit() on
error. Instead, errors should be propagated through the realize()
method. Additionally, the realize() method should fail cleanly,
i.e. carefully undo its side effects such as wiring of interrupts,
mapping of memory, and so forth. Tedious work, but necessary to make
hot plug safe.
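To make the pattern concrete, here is a minimal sketch. The Error type, the
error_setg_sketch() helper, and the DemoDevice names below are simplified
stand-ins for illustration only, not QEMU's actual API (which lives in
include/qapi/error.h):

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for QEMU's Error object. */
typedef struct Error { char msg[128]; } Error;

static void error_setg_sketch(Error **errp, const char *msg)
{
    static Error e;
    if (errp) {
        snprintf(e.msg, sizeof e.msg, "%s", msg);
        *errp = &e;
    }
}

/* Hypothetical device state: two side effects that realize() applies. */
typedef struct DemoDevice {
    bool irq_wired;
    bool mmio_mapped;
} DemoDevice;

/*
 * realize() in the spirit of the text: on failure, undo the side
 * effects applied so far and propagate the error to the caller
 * through errp, instead of calling exit().
 */
static void demo_realize(DemoDevice *dev, bool fail_mapping, Error **errp)
{
    dev->irq_wired = true;              /* side effect 1: wire interrupts */

    if (fail_mapping) {                 /* simulated mid-realize failure */
        dev->irq_wired = false;         /* roll back side effect 1 */
        error_setg_sketch(errp, "mapping MMIO failed");
        return;
    }
    dev->mmio_mapped = true;            /* side effect 2: map memory */
}
```

The caller (board init or hotplug machinery) then decides whether the error
is fatal; the device itself stays reusable because nothing leaks.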
Quite a few devices don't do that.
Some of them can be usefully hot-plugged, and for them unclean failures
are simply bugs. I'm going to mark the ones I can find.
Others are used only as onboard devices, and if their realize() fails,
the machine's init() will exit()[*]. In an ideal world, we'd start with
an empty board and cold-plug devices, and there, clean failure may be
useful. In the world we live in, making these devices fail cleanly is a
lot of tedious work for no immediate gain.
Example: machine "kzm" and device "fsl,imx31". fsl_imx31_realize()
returns without cleanup on error. kzm_init() exit(1)s when realize
fails, so the lack of cleanup is a non-issue.
I think this is basically okay for now, but I'd like us to mark these
devices cannot_instantiate_with_device_add_yet, with /* Reason:
realize() method fails uncleanly */.
Opinions?
Next, let's consider the special case "out of memory".
Our general approach is to treat it as immediately fatal. This makes
sense, because when a smallish allocation fails, the process is almost
certainly doomed anyway. Moreover, memory allocation is frequent, and
attempting to recover from failed memory allocation adds loads of
hard-to-test error paths. These are *dangerous* unless carefully tested
(and we don't).
Certain important allocations we handle more gracefully. For instance,
we don't want to die when the user tries to hot-plug more memory than we
can allocate, or tries to open a QCOW2 image with a huge L1 table.
Guest memory allocation used to have the "immediately fatal" policy
baked in at a fairly low level, but it's since been lifted into callers;
see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review
of the latter, Peter Crosthwaite called out the &error_fatal in the
realize methods and their supporting code. I agreed with him back then
that the errors should really be propagated. But now I've changed my
mind: I think we should treat these memory allocation failures like
we've always treated them, namely report and exit(1). Except for
"large" allocations, where we have a higher probability of failure, and
a more realistic chance to recover safely.
Can we agree that passing &error_fatal to memory_region_init_ram() &
friends is basically okay even in realize() methods and their supporting
code?
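As a sketch of the semantics being proposed (a simplified emulation, not
QEMU's actual implementation; all names here are illustrative): &error_fatal
acts as a sentinel destination, and any error helper that sees it reports the
message and exit(1)s instead of propagating.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Error { const char *msg; } Error;

/* Sentinel in the spirit of QEMU's &error_fatal. */
static Error error_fatal_sketch;
#define ERROR_FATAL_SKETCH (&error_fatal_sketch)

static void error_set_sketch(Error **errp, const char *msg)
{
    if (errp == (Error **)ERROR_FATAL_SKETCH) {
        fprintf(stderr, "qemu: %s\n", msg);
        exit(1);                 /* report and die, no unwinding */
    }
    if (errp) {
        static Error e;
        e.msg = msg;
        *errp = &e;              /* propagate to the caller */
    }
}

/* Stand-in for memory_region_init_ram(): allocate guest RAM, or
 * report through errp. */
static void *init_ram_sketch(size_t size, Error **errp)
{
    void *ram = malloc(size);
    if (!ram) {
        error_set_sketch(errp, "cannot allocate guest RAM");
    }
    return ram;
}
```

A realize() method passing ERROR_FATAL_SKETCH would thus terminate the
process on allocation failure; passing a real Error ** keeps the choice with
the caller. (The fatal path exits the process, so it is not exercised by an
in-process test.)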
[*] Well, the ones that bother to check for errors, but that's a
separate problem.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-08 13:47 [Qemu-devel] Error handling in realize() methods Markus Armbruster
@ 2015-12-08 14:19 ` Dr. David Alan Gilbert
2015-12-09 9:30 ` Markus Armbruster
0 siblings, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-08 14:19 UTC (permalink / raw)
To: Markus Armbruster
Cc: Peter Maydell, Paolo Bonzini, Peter Crosthwaite, qemu-devel,
Andreas Färber
* Markus Armbruster (armbru@redhat.com) wrote:
> In general, code running within a realize() method should not exit() on
> error. Instead, errors should be propagated through the realize()
> method. Additionally, the realize() method should fail cleanly,
> i.e. carefully undo its side effects such as wiring of interrupts,
> mapping of memory, and so forth. Tedious work, but necessary to make
> hot plug safe.
>
> Quite a few devices don't do that.
>
> Some of them can be usefully hot-plugged, and for them unclean failures
> are simply bugs. I'm going to mark the ones I can find.
>
> Others are used only as onboard devices, and if their realize() fails,
> the machine's init() will exit()[*]. In an ideal world, we'd start with
> an empty board and cold-plug devices, and there, clean failure may be
> useful. In the world we live in, making these devices fail cleanly is a
> lot of tedious work for no immediate gain.
>
> Example: machine "kzm" and device "fsl,imx31". fsl_imx31_realize()
> returns without cleanup on error. kzm_init() exit(1)s when realize
> fails, so the lack of cleanup is a non-issue.
>
> I think this is basically okay for now, but I'd like us to mark these
> devices cannot_instantiate_with_device_add_yet, with /* Reason:
> realize() method fails uncleanly */.
>
> Opinions?
>
> Next, let's consider the special case "out of memory".
>
> Our general approach is to treat it as immediately fatal. This makes
> sense, because when a smallish allocation fails, the process is almost
> certainly doomed anyway. Moreover, memory allocation is frequent, and
> attempting to recover from failed memory allocation adds loads of
> hard-to-test error paths. These are *dangerous* unless carefully tested
> (and we don't).
>
> Certain important allocations we handle more gracefully. For instance,
> we don't want to die when the user tries to hot-plug more memory than we
> can allocate, or tries to open a QCOW2 image with a huge L1 table.
>
> Guest memory allocation used to have the "immediately fatal" policy
> baked in at a fairly low level, but it's since been lifted into callers;
> see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review
> of the latter, Peter Crosthwaite called out the &error_fatal in the
> realize methods and their supporting code. I agreed with him back then
> that the errors should really be propagated. But now I've changed my
> mind: I think we should treat these memory allocation failures like
> we've always treated them, namely report and exit(1). Except for
> "large" allocations, where we have a higher probability of failure, and
> a more realistic chance to recover safely.
>
> Can we agree that passing &error_fatal to memory_region_init_ram() &
> friends is basically okay even in realize() methods and their supporting
> code?
I'd say it depends if they can be hotplugged; I think anything that we really
want to hotplug onto real running machines (as opposed to where we're just
hotplugging during setup) we should propagate errors properly.
And tbh I don't buy the small allocation argument; I think we should handle them
all; in my utopian world a guest wouldn't die unless there was no way out.
Dave
>
> [*] Well, the ones that bother to check for errors, but that's a
> separate problem.
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-08 14:19 ` Dr. David Alan Gilbert
@ 2015-12-09 9:30 ` Markus Armbruster
2015-12-09 10:29 ` Dr. David Alan Gilbert
2015-12-09 13:09 ` Paolo Bonzini
0 siblings, 2 replies; 19+ messages in thread
From: Markus Armbruster @ 2015-12-09 9:30 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Peter Maydell, Peter Crosthwaite, qemu-devel, Andreas Färber,
Paolo Bonzini
"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> * Markus Armbruster (armbru@redhat.com) wrote:
>> In general, code running within a realize() method should not exit() on
>> error. Instead, errors should be propagated through the realize()
>> method. Additionally, the realize() method should fail cleanly,
>> i.e. carefully undo its side effects such as wiring of interrupts,
>> mapping of memory, and so forth. Tedious work, but necessary to make
>> hot plug safe.
[...]
>> Next, let's consider the special case "out of memory".
>>
>> Our general approach is to treat it as immediately fatal. This makes
>> sense, because when a smallish allocation fails, the process is almost
>> certainly doomed anyway. Moreover, memory allocation is frequent, and
>> attempting to recover from failed memory allocation adds loads of
>> hard-to-test error paths. These are *dangerous* unless carefully tested
>> (and we don't).
>>
>> Certain important allocations we handle more gracefully. For instance,
>> we don't want to die when the user tries to hot-plug more memory than we
>> can allocate, or tries to open a QCOW2 image with a huge L1 table.
>>
>> Guest memory allocation used to have the "immediately fatal" policy
>> baked in at a fairly low level, but it's since been lifted into callers;
>> see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review
>> of the latter, Peter Crosthwaite called out the &error_fatal in the
>> realize methods and their supporting code. I agreed with him back then
>> that the errors should really be propagated. But now I've changed my
>> mind: I think we should treat these memory allocation failures like
>> we've always treated them, namely report and exit(1). Except for
>> "large" allocations, where we have a higher probability of failure, and
>> a more realistic chance to recover safely.
>>
>> Can we agree that passing &error_fatal to memory_region_init_ram() &
>> friends is basically okay even in realize() methods and their supporting
>> code?
>
> I'd say it depends if they can be hotplugged; I think anything that we really
> want to hotplug onto real running machines (as opposed to where we're just
> hotplugging during setup) we should propagate errors properly.
>
> And tbh I don't buy the small allocation argument; I think we should handle them
> all; in my utopian world a guest wouldn't die unless there was no way out.
I guess in Utopia nobody ever makes stupid coding mistakes, the error
paths are all covered by a comprehensive test suite, and they make the
code prettier, too. Oh, and kids always eat their vegetables without
complaint.
However, we don't actually live in Utopia. In our world, error paths
clutter the code, are full of bugs, and the histogram of their execution
counts in testing (automated or not) has a frighteningly tall bar at
zero.
We're not going to make this problem less severe by making it bigger.
In fact, we consciously decided to hack off a big chunk with an axe:
commit 8a1d02aba9f986ca03d854184cd432ee98bcd179
Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Date: Thu Feb 5 22:05:49 2009 +0000
Terminate emulation on memory allocation failure (Avi Kivity)
Memory allocation failures are a very rare condition on virtual-memory
hosts. They are also very difficult to handle correctly (especially in a
hardware emulation context). Because of this, it is better to gracefully
terminate emulation rather than executing untested or even unwritten recovery
code paths.
This patch changes the qemu memory allocation routines to terminate emulation
if an allocation failure is encountered.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6526 c046a42c-6fe2-441c-8c8c-71466251a162
Let me elaborate a bit on Avi's arguments:
* Memory allocations are very, very common. I count at least 2500.
Memory allocation failure is easily the most common *potential* error,
both statically and dynamically.
* Error recovery is always tedious and often hard. Especially when the
error happens in the middle of a complex operation that needs to be
fully rolled back to recover. A sensible approach is to acquire
resources early, when recovery is still relatively easy, but that's
often impractical for memory. This together with the ubiquitousness
of memory allocation makes memory allocation failure even harder to
handle than other kinds of errors.
* Not all allocations are equal. When an attempt to allocate a gigabyte
fails gracefully, there's a good chance that ordinary code can go on
allocating and freeing kilobytes as usual. But when an attempt to
allocate kilobytes fails, it's very likely that handling this failure
gracefully will only lead to another one, and another one, until some
buggy error handling puts the doomed process out of its misery.
Out of the ~2400 memory allocations that go through GLib, 59 can fail.
The others all terminate the process.
* How often do we see failures from these other 2300+? Bug reports from
users? As far as I can see, they're vanishingly rare.
* Reviewing and testing the error handling chains rooted at 59
allocations is hard enough, and I don't think we're doing particularly
well on the testing. What chance would we have with 2300+ more?
2300+ instances of a vanishingly rare error with generally complex
error handling and basically no test coverage: does that sound more
useful than 2300+ instances of a vanishingly rare error that kills the
process? If yes, how much of our limited resources is the difference
worth?
* You might argue that we don't have to handle all 2300+ instances, only
the ones reachable from hotplug. I think that's a pipe dream. Once
you permit "terminate on memory allocation failure" in general purpose
code, it hides behind innocent-looking function calls. Even if we
could find them all, we'd still have to add memory allocation failure
handling to lots of general purpose code just to make it usable for
hot plug. And then we'd get to keep finding them all forever.
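The "not all allocations are equal" point above can be sketched as two
wrappers in the spirit of GLib's g_malloc() and g_try_malloc() (the names
below are illustrative stand-ins, not QEMU helpers):

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Small, routine allocations: like g_malloc(), failure terminates the
 * process rather than spawning thousands of untested recovery paths.
 */
static void *xmalloc_sketch(size_t size)
{
    void *p = malloc(size);
    if (!p) {
        fprintf(stderr, "out of memory allocating %zu bytes\n", size);
        abort();
    }
    return p;
}

/*
 * Large, failure-prone allocations (guest RAM, huge tables): like
 * g_try_malloc(), return NULL and let the caller recover, e.g. by
 * rejecting the hotplug request instead of killing the VM.
 */
static void *try_malloc_sketch(size_t size)
{
    return malloc(size);
}
```

The dividing line is the one argued above: recovery code is only worth its
bugs where failure is plausible and the process is still viable afterwards.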
I don't think handling all memory allocation failures gracefully
everywhere or even just within hotplug is practical this side of Utopia.
I believe all we could achieve by trying is an illusion of graceful
handling, sufficiently convincing to let us pat ourselves on the back,
call the job done, and go back to working on stuff that matters to
actual users.
My current working assumption is that passing &error_fatal to
memory_region_init_ram() & friends is okay even in realize() methods and
their supporting code, except when the allocation can be large. Even
then, &error_fatal is better than buggy recovery code (which I can see
all over the place, but that's a separate topic).
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 9:30 ` Markus Armbruster
@ 2015-12-09 10:29 ` Dr. David Alan Gilbert
2015-12-09 11:10 ` Laszlo Ersek
` (2 more replies)
2015-12-09 13:09 ` Paolo Bonzini
1 sibling, 3 replies; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-09 10:29 UTC (permalink / raw)
To: Markus Armbruster
Cc: Peter Maydell, Peter Crosthwaite, qemu-devel, Andreas Färber,
Paolo Bonzini
* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>
> > * Markus Armbruster (armbru@redhat.com) wrote:
> >> In general, code running within a realize() method should not exit() on
> >> error. Instead, errors should be propagated through the realize()
> >> method. Additionally, the realize() method should fail cleanly,
> >> i.e. carefully undo its side effects such as wiring of interrupts,
> >> mapping of memory, and so forth. Tedious work, but necessary to make
> >> hot plug safe.
> [...]
> >> Next, let's consider the special case "out of memory".
> >>
> >> Our general approach is to treat it as immediately fatal. This makes
> >> sense, because when a smallish allocation fails, the process is almost
> >> certainly doomed anyway. Moreover, memory allocation is frequent, and
> >> attempting to recover from failed memory allocation adds loads of
> >> hard-to-test error paths. These are *dangerous* unless carefully tested
> >> (and we don't).
> >>
> >> Certain important allocations we handle more gracefully. For instance,
> >> we don't want to die when the user tries to hot-plug more memory than we
> >> can allocate, or tries to open a QCOW2 image with a huge L1 table.
> >>
> >> Guest memory allocation used to have the "immediately fatal" policy
> >> baked in at a fairly low level, but it's since been lifted into callers;
> >> see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review
> >> of the latter, Peter Crosthwaite called out the &error_fatal in the
> >> realize methods and their supporting code. I agreed with him back then
> >> that the errors should really be propagated. But now I've changed my
> >> mind: I think we should treat these memory allocation failures like
> >> we've always treated them, namely report and exit(1). Except for
> >> "large" allocations, where we have a higher probability of failure, and
> >> a more realistic chance to recover safely.
> >>
> >> Can we agree that passing &error_fatal to memory_region_init_ram() &
> >> friends is basically okay even in realize() methods and their supporting
> >> code?
> >
> > I'd say it depends if they can be hotplugged; I think anything that we really
> > want to hotplug onto real running machines (as opposed to where we're just
> > hotplugging during setup) we should propagate errors properly.
> >
> > And tbh I don't buy the small allocation argument; I think we should handle them
> > all; in my utopian world a guest wouldn't die unless there was no way out.
>
> I guess in Utopia nobody ever makes stupid coding mistakes, the error
> paths are all covered by a comprehensive test suite, and they make the
> code prettier, too. Oh, and kids always eat their vegetables without
> complaint.
Yes, it's lovely.
> However, we don't actually live in Utopia. In our world, error paths
> clutter the code, are full of bugs, and the histogram of their execution
> counts in testing (automated or not) has a frighteningly tall bar at
> zero.
>
> We're not going to make this problem less severe by making it bigger.
> In fact, we consciously decided to hack off a big chunk with an axe:
>
> commit 8a1d02aba9f986ca03d854184cd432ee98bcd179
> Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
> Date: Thu Feb 5 22:05:49 2009 +0000
>
> Terminate emulation on memory allocation failure (Avi Kivity)
>
> Memory allocation failures are a very rare condition on virtual-memory
> hosts. They are also very difficult to handle correctly (especially in a
> hardware emulation context). Because of this, it is better to gracefully
> terminate emulation rather than executing untested or even unwritten recovery
> code paths.
>
> This patch changes the qemu memory allocation routines to terminate emulation
> if an allocation failure is encountered.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>
>
> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6526 c046a42c-6fe2-441c-8c8c-71466251a162
>
> Let me elaborate a bit on Avi's arguments:
>
> * Memory allocations are very, very common. I count at least 2500.
> Memory allocation failure is easily the most common *potential* error,
> both statically and dynamically.
>
> * Error recovery is always tedious and often hard. Especially when the
> error happens in the middle of a complex operation that needs to be
> fully rolled back to recover. A sensible approach is to acquire
> resources early, when recovery is still relatively easy, but that's
> often impractical for memory. This together with the ubiquitousness
> of memory allocation makes memory allocation failure even harder to
> handle than other kinds of errors.
>
> * Not all allocations are equal. When an attempt to allocate a gigabyte
> fails gracefully, there's a good chance that ordinary code can go on
> allocating and freeing kilobytes as usual. But when an attempt to
> allocate kilobytes fails, it's very likely that handling this failure
> gracefully will only lead to another one, and another one, until some
> buggy error handling puts the doomed process out of its misery.
>
> Out of the ~2400 memory allocations that go through GLib, 59 can fail.
> The others all terminate the process.
>
> * How often do we see failures from these other 2300+? Bug reports from
> users? As far as I can see, they're vanishingly rare.
>
> * Reviewing and testing the error handling chains rooted at 59
> allocations is hard enough, and I don't think we're doing particularly
> well on the testing. What chance would we have with 2300+ more?
>
> 2300+ instances of a vanishingly rare error with generally complex
> error handling and basically no test coverage: does that sound more
> useful than 2300+ instances of a vanishingly rare error that kills the
> process? If yes, how much of our limited resources is the difference
> worth?
>
> * You might argue that we don't have to handle all 2300+ instances, only
> the ones reachable from hotplug. I think that's a pipe dream. Once
> you permit "terminate on memory allocation failure" in general purpose
> code, it hides behind innocent-looking function calls. Even if we
> could find them all, we'd still have to add memory allocation failure
> handling to lots of general purpose code just to make it usable for
> hot plug. And then we'd get to keep finding them all forever.
I didn't say it was easy :-)
> I don't think handling all memory allocation failures gracefully
> everywhere or even just within hotplug is practical this side of Utopia.
> I believe all we could achieve by trying is an illusion of graceful
> handling, sufficiently convincing to let us pat ourselves on the back,
> call the job done, and go back to working on stuff that matters to
> actual users.
Handling them all probably isn't; handling some well defined cases is
probably possible.
Avi's argument is 6 years old; I suggest a few things have changed in
that time:
a) We now use the Error** mechanism in a lot of places - so a lot of
code already is supposed to deal with a function call failing; if a function
already has an error return and the caller deals with it, then making
the function deal with an allocation error and the caller handle it is
a lot easier.
b) The use of hotplug is now common - people really hate it when their
nice, happy working VM dies when they try and do something to it, like
hotplug or migrate.
c) I suspect (but don't know) that people are pushing the number of VMs on
a host harder than they used to, but there again memory got cheap.
I'm not that eager to protect every allocation; but in some common
cases, where we already have Error** paths and it's relatively simple,
then I think we should.
(OK, to be honest I think we should protect every allocation - but I do
have sympathy with the complexity/testing arguments).
Dave
> My current working assumption is that passing &error_fatal to
> memory_region_init_ram() & friends is okay even in realize() methods and
> their supporting code, except when the allocation can be large. Even
> then, &error_fatal is better than buggy recovery code (which I can see
> all over the place, but that's a separate topic).
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 10:29 ` Dr. David Alan Gilbert
@ 2015-12-09 11:10 ` Laszlo Ersek
2015-12-10 9:22 ` Markus Armbruster
2015-12-09 11:47 ` Peter Maydell
2015-12-10 9:27 ` Markus Armbruster
2 siblings, 1 reply; 19+ messages in thread
From: Laszlo Ersek @ 2015-12-09 11:10 UTC (permalink / raw)
To: Dr. David Alan Gilbert, Markus Armbruster
Cc: Peter Maydell, Peter Crosthwaite, Paolo Bonzini, qemu-devel,
Andreas Färber
On 12/09/15 11:29, Dr. David Alan Gilbert wrote:
> * Markus Armbruster (armbru@redhat.com) wrote:
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>>
>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>> In general, code running within a realize() method should not exit() on
>>>> error. Instead, errors should be propagated through the realize()
>>>> method. Additionally, the realize() method should fail cleanly,
>>>> i.e. carefully undo its side effects such as wiring of interrupts,
>>>> mapping of memory, and so forth. Tedious work, but necessary to make
>>>> hot plug safe.
>> [...]
>>>> Next, let's consider the special case "out of memory".
>>>>
>>>> Our general approach is to treat it as immediately fatal. This makes
>>>> sense, because when a smallish allocation fails, the process is almost
>>>> certainly doomed anyway. Moreover, memory allocation is frequent, and
>>>> attempting to recover from failed memory allocation adds loads of
>>>> hard-to-test error paths. These are *dangerous* unless carefully tested
>>>> (and we don't).
>>>>
>>>> Certain important allocations we handle more gracefully. For instance,
>>>> we don't want to die when the user tries to hot-plug more memory than we
>>>> can allocate, or tries to open a QCOW2 image with a huge L1 table.
>>>>
>>>> Guest memory allocation used to have the "immediately fatal" policy
>>>> baked in at a fairly low level, but it's since been lifted into callers;
>>>> see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review
>>>> of the latter, Peter Crosthwaite called out the &error_fatal in the
>>>> realize methods and their supporting code. I agreed with him back then
>>>> that the errors should really be propagated. But now I've changed my
>>>> mind: I think we should treat these memory allocation failures like
>>>> we've always treated them, namely report and exit(1). Except for
>>>> "large" allocations, where we have a higher probability of failure, and
>>>> a more realistic chance to recover safely.
>>>>
>>>> Can we agree that passing &error_fatal to memory_region_init_ram() &
>>>> friends is basically okay even in realize() methods and their supporting
>>>> code?
>>>
>>> I'd say it depends if they can be hotplugged; I think anything that we really
>>> want to hotplug onto real running machines (as opposed to where we're just
>>> hotplugging during setup) we should propagate errors properly.
>>>
>>> And tbh I don't buy the small allocation argument; I think we should handle them
>>> all; in my utopian world a guest wouldn't die unless there was no way out.
>>
>> I guess in Utopia nobody ever makes stupid coding mistakes, the error
>> paths are all covered by a comprehensive test suite, and they make the
>> code prettier, too. Oh, and kids always eat their vegetables without
>> complaint.
>
> Yes, it's lovely.
>
>> However, we don't actually live in Utopia. In our world, error paths
>> clutter the code, are full of bugs, and the histogram of their execution
>> counts in testing (automated or not) has a frighteningly tall bar at
>> zero.
>>
>> We're not going to make this problem less severe by making it bigger.
>> In fact, we consciously decided to hack off a big chunk with an axe:
>>
>> commit 8a1d02aba9f986ca03d854184cd432ee98bcd179
>> Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
>> Date: Thu Feb 5 22:05:49 2009 +0000
>>
>> Terminate emulation on memory allocation failure (Avi Kivity)
>>
>> Memory allocation failures are a very rare condition on virtual-memory
>> hosts. They are also very difficult to handle correctly (especially in a
>> hardware emulation context). Because of this, it is better to gracefully
>> terminate emulation rather than executing untested or even unwritten recovery
>> code paths.
>>
>> This patch changes the qemu memory allocation routines to terminate emulation
>> if an allocation failure is encountered.
>>
>> Signed-off-by: Avi Kivity <avi@redhat.com>
>> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>
>>
>> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6526 c046a42c-6fe2-441c-8c8c-71466251a162
>>
>> Let me elaborate a bit on Avi's arguments:
>>
>> * Memory allocations are very, very common. I count at least 2500.
>> Memory allocation failure is easily the most common *potential* error,
>> both statically and dynamically.
>>
>> * Error recovery is always tedious and often hard. Especially when the
>> error happens in the middle of a complex operation that needs to be
>> fully rolled back to recover. A sensible approach is to acquire
>> resources early, when recovery is still relatively easy, but that's
>> often impractical for memory. This together with the ubiquitousness
>> of memory allocation makes memory allocation failure even harder to
>> handle than other kinds of errors.
>>
>> * Not all allocations are equal. When an attempt to allocate a gigabyte
>> fails gracefully, there's a good chance that ordinary code can go on
>> allocating and freeing kilobytes as usual. But when an attempt to
>> allocate kilobytes fails, it's very likely that handling this failure
>> gracefully will only lead to another one, and another one, until some
>> buggy error handling puts the doomed process out of its misery.
>>
>> Out of the ~2400 memory allocations that go through GLib, 59 can fail.
>> The others all terminate the process.
>>
>> * How often do we see failures from these other 2300+? Bug reports from
>> users? As far as I can see, they're vanishingly rare.
>>
>> * Reviewing and testing the error handling chains rooted at 59
>> allocations is hard enough, and I don't think we're doing particularly
>> well on the testing. What chance would we have with 2300+ more?
>>
>> 2300+ instances of a vanishingly rare error with generally complex
>> error handling and basically no test coverage: does that sound more
>> useful than 2300+ instances of a vanishingly rare error that kills the
>> process? If yes, how much of our limited resources is the difference
>> worth?
>>
>> * You might argue that we don't have to handle all 2300+ instances, only
>> the ones reachable from hotplug. I think that's a pipe dream. Once
>> you permit "terminate on memory allocation failure" in general purpose
>> code, it hides behind innocent-looking function calls. Even if we
>> could find them all, we'd still have to add memory allocation failure
>> handling to lots of general purpose code just to make it usable for
>> hot plug. And then we'd get to keep finding them all forever.
>
> I didn't say it was easy :-)
>
>> I don't think handling all memory allocation failures gracefully
>> everywhere or even just within hotplug is practical this side of Utopia.
>> I believe all we could achieve by trying is an illusion of graceful
>> handling, sufficiently convincing to let us pat ourselves on the back,
>> call the job done, and go back to working on stuff that matters to
>> actual users.
>
> Handling them all probably isn't; handling some well defined cases is
> probably possible.
>
> Avi's argument is 6 years old; I suggest a few things have changed in
> that time:
> a) We now use the Error** mechanism in a lot of places - so a lot of
> code already is supposed to deal with a function call failing; if a function
> already has an error return and the caller deals with it, then making
> the function deal with an allocation error and the caller handle it is
> a lot easier.
> b) The use of hotplug is now common - people really hate it when their
> nice, happy working VM dies when they try and do something to it, like
> hotplug or migrate.
> c) I suspect (but don't know) that people are pushing the number of VMs on
> a host harder than they used to, but there again memory got cheap.
>
> I'm not that eager to protect every allocation; but in some common
> cases, where we already have Error** paths and it's relatively simple,
> then I think we should.
>
> (OK, to be honest I think we should protect every allocation - but I do
> have sympathy with the complexity/testing arguments).
I've been following this discussion with great interest.
My opinion should not be considered, because I won't be turning my
opinion into new code, or an agreement to support / maintain code. :)
My opinion is that
- every single allocation needs to be checked rigorously,
- any partial construction of a more complex object needs to be rolled
back in precise reverse order upon encountering any kind of failure
(not just allocation)
- this should occur regardless of testing coverage (although projects
exist (for example, SQLite, IIRC) that use random or systematic
malloc() error injection in their test suite, for good coverage)
- the primary requirements for this to work are:
- a clear notion of ownership at any point in the code
- a disciplined approach to ownership tracking; for example, a helper
callee (responsible for constructing a member of a more complex
object) is forbidden from releasing "sibling" resources allocated
by the caller
This is possible to do (I'm doing it and enforcing it in OVMF), but it
takes a lot of discipline, and *historically* the QEMU codebase has
stunk, whenever I've looked at its ownership tracking during
construction of objects.
I feel that in the last sequence of months (years?) the developer
discipline and the codebase have improved a *great* deal. Still I cannot
say how feasible it would be to bring all existent code into conformance
with the above.
... As I said, I just wanted to express this opinion as another (not
really practical) data point. My children utterly hate spinach, so
Markus's counterpoint is definitely not lost on me.
Thanks
Laszlo
>
> Dave
>
>> My current working assumption is that passing &error_fatal to
>> memory_region_init_ram() & friends is okay even in realize() methods and
>> their supporting code, except when the allocation can be large. Even
>> then, &error_fatal is better than buggy recovery code (which I can see
>> all over the place, but that's a separate topic).
>
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 10:29 ` Dr. David Alan Gilbert
2015-12-09 11:10 ` Laszlo Ersek
@ 2015-12-09 11:47 ` Peter Maydell
2015-12-09 12:25 ` Laszlo Ersek
2015-12-09 13:21 ` Dr. David Alan Gilbert
2015-12-10 9:27 ` Markus Armbruster
2 siblings, 2 replies; 19+ messages in thread
From: Peter Maydell @ 2015-12-09 11:47 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Paolo Bonzini, Peter Crosthwaite, Markus Armbruster,
Andreas Färber, QEMU Developers
On 9 December 2015 at 10:29, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> (OK, to be honest I think we should protect every allocation - but I do
> have sympathy with the complexity/testing arguments).
My view on this is that Linux overcommits, so the actual likely
way that "oops, out of memory" will manifest is by some page not
being able to be allocated-on-demand, at which point your process
is toast anyway. Checking malloc returns is really only checking
your virtual address space allocation, which typically speaking
always succeeds, except in the "we tried to get gigabytes at
once" case...
thanks
-- PMM
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 11:47 ` Peter Maydell
@ 2015-12-09 12:25 ` Laszlo Ersek
2015-12-09 13:21 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 19+ messages in thread
From: Laszlo Ersek @ 2015-12-09 12:25 UTC (permalink / raw)
To: Peter Maydell, Dr. David Alan Gilbert
Cc: QEMU Developers, Paolo Bonzini, Peter Crosthwaite,
Markus Armbruster, Andreas Färber
On 12/09/15 12:47, Peter Maydell wrote:
> On 9 December 2015 at 10:29, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>> (OK, to be honest I think we should protect every allocation - but I do
>> have sympathy with the complexity/testing arguments).
>
> My view on this is that Linux overcommits, so the actual likely
> way that "oops, out of memory" will manifest is by some page not
> being able to be allocated-on-demand, at which point your process
> is toast anyway.
This is a frequent argument, but this behavior is not acceptable for
many production workloads (for example, it is not acceptable on my
laptop :)), and it can be disabled with the overcommit knobs under
/proc/sys/vm.
Also, the OOM killer destroying the "wrong" process is a recurrent
source of fun.
Of course, under memory pressure there is no telling which process will
get ENOMEM first, even with overcommit disabled, but that setting at
least allows the process to deal with the error, should it choose to do so.
On a heavy duty virtualization host, one can expect that *only* such
applications run that can "behave" in response to ENOMEM -- I believe
libvirtd is an example? --; hence disabling overcommit is especially valid.
(I continue to offer my (unsolicited) opinion in this thread, and I
continue encouraging QEMU people to ignore it whenever appropriate. I
don't strive to influence the decision; I'd just like to contribute to
the well-informed nature of the decision.)
Thanks
Laszlo
> Checking malloc returns is really only checking
> your virtual address space allocation, which typically speaking
> always succeeds, except in the "we tried to get gigabytes at
> once" case...
>
> thanks
> -- PMM
>
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 9:30 ` Markus Armbruster
2015-12-09 10:29 ` Dr. David Alan Gilbert
@ 2015-12-09 13:09 ` Paolo Bonzini
2015-12-09 13:12 ` Dr. David Alan Gilbert
2015-12-10 11:06 ` Markus Armbruster
1 sibling, 2 replies; 19+ messages in thread
From: Paolo Bonzini @ 2015-12-09 13:09 UTC (permalink / raw)
To: Markus Armbruster, Dr. David Alan Gilbert
Cc: Peter Maydell, Peter Crosthwaite, qemu-devel, Andreas Färber
On 09/12/2015 10:30, Markus Armbruster wrote:
> My current working assumption is that passing &error_fatal to
> memory_region_init_ram() & friends is okay even in realize() methods and
> their supporting code, except when the allocation can be large.
I suspect a lot of memory_region_init_ram()s could be considered
potentially large (at least in the 16-64 megabytes range). Propagation
of memory_region_init_ram() failures is easy enough, thanks to Error**,
that we should just do it.
Even if we don't, we should use &error_abort, not &error_fatal
(programmer error---due to laziness---rather than user error).
&error_fatal should really be restricted to code that is running very
close to main().
Paolo
> Even
> then, &error_fatal is better than buggy recovery code (which I can see
> all over the place, but that's a separate topic).
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 13:09 ` Paolo Bonzini
@ 2015-12-09 13:12 ` Dr. David Alan Gilbert
2015-12-09 13:43 ` Paolo Bonzini
2015-12-10 11:06 ` Markus Armbruster
1 sibling, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-09 13:12 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Peter Maydell, Peter Crosthwaite, Markus Armbruster,
Andreas Färber, qemu-devel
* Paolo Bonzini (pbonzini@redhat.com) wrote:
>
>
> On 09/12/2015 10:30, Markus Armbruster wrote:
> > My current working assumption is that passing &error_fatal to
> > memory_region_init_ram() & friends is okay even in realize() methods and
> > their supporting code, except when the allocation can be large.
>
> I suspect a lot of memory_region_init_ram()s could be considered
> potentially large (at least in the 16-64 megabytes range). Propagation
> of memory_region_init_ram() failures is easy enough, thanks to Error**,
> that we should just do it.
>
> Even if we don't, we should use &error_abort, not &error_fatal
> (programmer error---due to laziness---rather than user error).
> &error_fatal should really be restricted to code that is running very
> close to main().
No, we used to have error_abort and changed them out for error_fatal because
we were getting flooded with crash reports due to the aborts of people trying
to run VMs too big for their machine.
Dave
>
> Paolo
>
> > Even
> > then, &error_fatal is better than buggy recovery code (which I can see
> > all over the place, but that's a separate topic).
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 11:47 ` Peter Maydell
2015-12-09 12:25 ` Laszlo Ersek
@ 2015-12-09 13:21 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-09 13:21 UTC (permalink / raw)
To: Peter Maydell
Cc: Paolo Bonzini, Peter Crosthwaite, Markus Armbruster,
Andreas Färber, QEMU Developers
* Peter Maydell (peter.maydell@linaro.org) wrote:
> On 9 December 2015 at 10:29, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > (OK, to be honest I think we should protect every allocation - but I do
> > have sympathy with the complexity/testing arguments).
>
> My view on this is that Linux overcommits, so the actual likely
> way that "oops, out of memory" will manifest is by some page not
> being able to be allocated-on-demand, at which point your process
> is toast anyway. Checking malloc returns is really only checking
> your virtual address space allocation, which typically speaking
> always succeeds, except in the "we tried to get gigabytes at
> once" case...
We already get some failures, e.g. qemu-system-x86_64 -m 8192
fails with an allocation error on my laptop (8GB RAM+2GB swap).
I'm also not sure whether your statement is true once things like
cgroup/ulimit memory restrictions are used and/or mlock.
People really really hate OOM behaviour in production systems
and jump through hoops to try and avoid it.
Dave
>
> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 13:12 ` Dr. David Alan Gilbert
@ 2015-12-09 13:43 ` Paolo Bonzini
0 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2015-12-09 13:43 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: qemu-devel, Peter Maydell, Peter Crosthwaite, Markus Armbruster,
Andreas Färber
On 09/12/2015 14:12, Dr. David Alan Gilbert wrote:
>> > Even if we don't, we should use &error_abort, not &error_fatal
>> > (programmer error---due to laziness---rather than user error).
>> > &error_fatal should really be restricted to code that is running very
>> > close to main().
> No, we used to have error_abort and changed them out for error_fatal because
> we were getting flooded with crash reports due to the aborts of people trying
> to run VMs too big for their machine.
That's a different call site, it's memory_region_allocate_system_memory
and it currently does a manual error_report_err+exit(1). That one is
okay, because it's indeed running "very close to main()" (it's called by
machine_class->init, which is called by main). It could be kept
open-coded or changed to error_fatal.
Paolo
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 11:10 ` Laszlo Ersek
@ 2015-12-10 9:22 ` Markus Armbruster
2015-12-10 11:10 ` Laszlo Ersek
0 siblings, 1 reply; 19+ messages in thread
From: Markus Armbruster @ 2015-12-10 9:22 UTC (permalink / raw)
To: Laszlo Ersek
Cc: Peter Maydell, qemu-devel, Dr. David Alan Gilbert,
Peter Crosthwaite, Paolo Bonzini, Andreas Färber
Laszlo Ersek <lersek@redhat.com> writes:
> I've been following this discussion with great interest.
>
> My opinion should not be considered, because I won't be turning my
> opinion into new code, or an agreement to support / maintain code. :)
>
> My opinion is that
> - every single allocation needs to be checked rigorously,
> - any partial construction of a more complex object needs to be rolled
> back in precise reverse order upon encountering any kind of failure
> (not just allocation)
> - this should occur regardless of testing coverage (although projects
> exist (for example, SQLite, IIRC) that use random or systematic
> malloc() error injection in their test suite, for good coverage)
> - the primary requirements for this to work are:
> - a clear notion of ownership at any point in the code
> - a disciplined approach to ownership tracking; for example, a helper
> callee (responsible for constructing a member of a more complex
> object) is forbidden from releasing "sibling" resources allocated
> by the caller
>
> This is possible to do (I'm doing it and enforcing it in OVMF), but it
> takes a lot of discipline, and *historically* the QEMU codebase has
> stunk, whenever I've looked at its ownership tracking during
> construction of objects.
Doing it from the start is one thing. Laudable in theory, justifiable
in practice for some applications (I've seen it done for a satellite's
avionics software), but in general, life's too short for that kind of
manual error handling.
Retrofitting it into a big & messy existing code base is another thing
entirely. Believe me, I've done my share of cleaning up stuff. When
you're tempted to mess with something that works (although you barely
understand how or even why) to make it neater, or easier to maintain, or
handle vanishingly rare errors more gracefully, the one thing you need
to know is when *not* to do it, because the risk of you fscking it up
outweighs whatever you hope to gain.
> I feel that in the last sequence of months (years?) the developer
> discipline and the codebase have improved a *great* deal. Still I cannot
> say how feasible it would be to bring all existent code into conformance
> with the above.
Hah! We can't even get all the existent qdev init() methods converted
to realize(), which is a *much* simpler task. And that conveniently
ignores all the code that hasn't even made it to qdev.
We got bigger fish to fry. In fact, the queue of fish waiting to be
fried goes along the block a couple of times.
> ... As I said, I just wanted to express this opinion as another (not
> really practical) data point. My children utterly hate spinach, so
> Markus's counterpoint is definitely not lost on me.
To anyone advocating creating thousands (or even dozens) of non-trivial
new error paths: come back with a test suite. We have mountains of
evidence for chronic inability to get error paths right. In fact, I'm
working on a patch that adds a bunch of FIXMEs for one class of flawed
error recovery. Just FIXMEs, because I don't dare fixing the bugs
myself without a way to test the fix, and the fishes in the queue are
looking at me accusingly already.
If I may suggest a logo for such a test suite: kids eating spinach with
a smile feels apt.
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 10:29 ` Dr. David Alan Gilbert
2015-12-09 11:10 ` Laszlo Ersek
2015-12-09 11:47 ` Peter Maydell
@ 2015-12-10 9:27 ` Markus Armbruster
2 siblings, 0 replies; 19+ messages in thread
From: Markus Armbruster @ 2015-12-10 9:27 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Peter Maydell, Peter Crosthwaite, Paolo Bonzini, qemu-devel,
Andreas Färber
"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> * Markus Armbruster (armbru@redhat.com) wrote:
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>>
>> > * Markus Armbruster (armbru@redhat.com) wrote:
>> >> In general, code running within a realize() method should not exit() on
>> >> error. Instead, errors should be propagated through the realize()
>> >> method. Additionally, the realize() method should fail cleanly,
>> >> i.e. carefully undo its side effects such as wiring of interrupts,
>> >> mapping of memory, and so forth. Tedious work, but necessary to make
>> >> hot plug safe.
>> [...]
>> >> Next, let's consider the special case "out of memory".
>> >>
>> >> Our general approach is to treat it as immediately fatal. This makes
>> >> sense, because when a smallish allocation fails, the process is almost
>> >> certainly doomed anyway. Moreover, memory allocation is frequent, and
>> >> attempting to recover from failed memory allocation adds loads of
>> >> hard-to-test error paths. These are *dangerous* unless carefully tested
>> >> (and we don't).
>> >>
>> >> Certain important allocations we handle more gracefully. For instance,
>> >> we don't want to die when the user tries to hot-plug more memory than we
>> >> can allocate, or tries to open a QCOW2 image with a huge L1 table.
>> >>
>> >> Guest memory allocation used to have the "immediately fatal" policy
>> >> baked in at a fairly low level, but it's since been lifted into callers;
>> >> see commit c261d77..fc7a580 and fixups 4f96676..0bdaa3a. During review
>> >> of the latter, Peter Crosthwaite called out the &error_fatal in the
>> >> realize methods and their supporting code. I agreed with him back then
>> >> that the errors should really be propagated. But now I've changed my
>> >> mind: I think we should treat these memory allocation failures like
>> >> we've always treated them, namely report and exit(1). Except for
>> >> "large" allocations, where we have a higher probability of failure, and
>> >> a more realistic chance to recover safely.
>> >>
>> >> Can we agree that passing &error_fatal to memory_region_init_ram() &
>> >> friends is basically okay even in realize() methods and their supporting
>> >> code?
>> >
>> > I'd say it depends if they can be hotplugged; I think anything
>> > that we really
>> > want to hotplug onto real running machines (as opposed to where we're just
>> > hotplugging during setup) we should propagate errors properly.
>> >
>> > And tbh I don't buy the small allocation argument; I think we
>> > should handle them
>> > all; in my utopian world a guest wouldn't die unless there was no way out.
>>
>> I guess in Utopia nobody ever makes stupid coding mistakes, the error
>> paths are all covered by a comprehensive test suite, and they make the
>> code prettier, too. Oh, and kids always eat their vegetables without
>> complaint.
>
> Yes, it's lovely.
>
>> However, we don't actually live in Utopia. In our world, error paths
>> clutter the code, are full of bugs, and the histogram of their execution
>> counts in testing (automated or not) has a frighteningly tall bar at
>> zero.
>>
>> We're not going to make this problem less severe by making it bigger.
>> In fact, we consciously decided to hack off a big chunk with an axe:
>>
>> commit 8a1d02aba9f986ca03d854184cd432ee98bcd179
>> Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
>> Date: Thu Feb 5 22:05:49 2009 +0000
>>
>> Terminate emulation on memory allocation failure (Avi Kivity)
>>
>> Memory allocation failures are a very rare condition on virtual-memory
>> hosts. They are also very difficult to handle correctly (especially in a
>> hardware emulation context). Because of this, it is better to gracefully
>> terminate emulation rather than executing untested or even unwritten recovery
>> code paths.
>>
>> This patch changes the qemu memory allocation routines to terminate emulation
>> if an allocation failure is encountered.
>>
>> Signed-off-by: Avi Kivity <avi@redhat.com>
>> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>
>>
>> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6526 c046a42c-6fe2-441c-8c8c-71466251a162
>>
>> Let me elaborate a bit on Avi's arguments:
>>
>> * Memory allocations are very, very common. I count at least 2500,
>> Memory allocation failure is easily the most common *potential* error,
>> both statically and dynamically.
>>
>> * Error recovery is always tedious and often hard. Especially when the
>> error happens in the middle of a complex operation that needs to be
>> fully rolled back to recover. A sensible approach is to acquire
>> resources early, when recovery is still relatively easy, but that's
>> often impractical for memory. This together with the ubiquitousness
>> of memory allocation makes memory allocation failure even harder to
>> handle than other kinds of errors.
>>
>> * Not all allocations are equal. When an attempt to allocate a gigabyte
>> fails gracefully, there's a good chance that ordinary code can go on
>> allocating and freeing kilobytes as usual. But when an attempt to
>> allocate kilobytes fails, it's very likely that handling this failure
>> gracefully will only lead to another one, and another one, until some
>> buggy error handling puts the doomed process out of its misery.
>>
>> Out of the ~2400 memory allocations that go through GLib, 59 can fail.
>> The others all terminate the process.
>>
>> * How often do we see failures from these other 2300+? Bug reports from
>> users? As far as I can see, they're vanishingly rare.
>>
>> * Reviewing and testing the error handling chains rooted at 59
>> allocations is hard enough, and I don't think we're doing particularly
>> well on the testing. What chance would we have with 2300+ more?
>>
>> 2300+ instances of a vanishingly rare error with generally complex
>> error handling and basically no test coverage: does that sound more
>> useful than 2300+ instances of a vanishingly rare error that kills the
>> process? If yes, how much of our limited resources is the difference
>> worth?
>>
>> * You might argue that we don't have to handle all 2300+ instances, only
>> the ones reachable from hotplug. I think that's a pipe dream. Once
>> you permit "terminate on memory allocation failure" in general purpose
>> code, it hides behind innocent-looking function calls. Even if we
>> could find them all, we'd still have to add memory allocation failure
>> handling to lots of general purpose code just to make it usable for
>> hot plug. And then we'd get to keep finding them all forever.
>
> I didn't say it was easy :-)
>
>> I don't think handling all memory allocation failures gracefully
>> everywhere or even just within hotplug is practical this side of Utopia.
>> I believe all we could achieve trying is an illusion of graceful
>> handling that is sufficiently convincing to let us pat on our backs,
>> call the job done, and go back to working on stuff that matters to
>> actual users.
>
> Handling them all probably isn't; handling some well defined cases is
> probably possible.
>
> Avi's argument is 6 years old, I suggest a few things have changed in that
> time:
> a) We now use the Error** mechanism in a lot of places - so a lot of
> code already is supposed to deal with a function call failing; if a function
> already has an error return and the caller deals with it, then making
> the function deal with an allocation error and the caller handle it is
> a lot easier.
You still have to get the error path within the failing function right,
and tested.
> b) The use of hotplug is now common - people really hate it when their
> nice, happy working VM dies when they try and do something to it, like
> hotplug or migrate.
> c) I suspect (but don't know) that people are pushing the number of VMs on
> a host harder than they used to, but there again memory got cheap.
>
> I'm not that eager to protect every allocation; but in some common
> cases, where we already have Error** paths and it's relatively simple,
> then I think we should.
I won't argue we can't do more than 59.
I'm working on a patch adding FIXMEs to a few functions whose error
paths look relatively simple, but are in fact absolutely wrong.
> (OK, to be honest I think we should protect every allocation - but I do
> have sympathy with the complexity/testing arguments).
>
> Dave
>
>> My current working assumption is that passing &error_fatal to
>> memory_region_init_ram() & friends is okay even in realize() methods and
>> their supporting code, except when the allocation can be large. Even
>> then, &error_fatal is better than buggy recovery code (which I can see
>> all over the place, but that's a separate topic).
>
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-09 13:09 ` Paolo Bonzini
2015-12-09 13:12 ` Dr. David Alan Gilbert
@ 2015-12-10 11:06 ` Markus Armbruster
2015-12-10 11:21 ` Dr. David Alan Gilbert
2015-12-10 11:26 ` Paolo Bonzini
1 sibling, 2 replies; 19+ messages in thread
From: Markus Armbruster @ 2015-12-10 11:06 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Peter Maydell, Peter Crosthwaite, Dr. David Alan Gilbert,
Andreas Färber, qemu-devel
Paolo Bonzini <pbonzini@redhat.com> writes:
> On 09/12/2015 10:30, Markus Armbruster wrote:
>> My current working assumption is that passing &error_fatal to
>> memory_region_init_ram() & friends is okay even in realize() methods and
>> their supporting code, except when the allocation can be large.
>
> I suspect a lot of memory_region_init_ram()s could be considered
> potentially large (at least in the 16-64 megabytes range). Propagation
> of memory_region_init_ram() failures is easy enough, thanks to Error**,
> that we should just do it.
Propagating an out-of-memory error right in realize() is easy. What's
not so easy is making realize() fail cleanly (all side effects undone;
we get that wrong in many places), and finding and propagating
out-of-memory errors hiding deeper in the call tree.
However, genuinely "large" allocations should be relatively few, and
handling them gracefully in hot-pluggable devices is probably feasible.
I doubt ensuring *all* allocations on behalf of a hot-pluggable device
are handled gracefully is a good use of our resources, or even
feasible.
Likewise, graceful error handling for devices that cannot be hot-plugged
feels like a waste of resources we can ill afford. I think we should
simply document their non-gracefulness by either setting hotpluggable =
false or cannot_instantiate_with_device_add_yet = true with a suitable
comment.
> Even if we don't, we should use &error_abort, not &error_fatal
> (programmer error---due to laziness---rather than user error).
> &error_fatal should really be restricted to code that is running very
> close to main().
"Very close to main" is a question of dynamic context.
Consider a device that can only be created during machine initialization
(cannot_instantiate_with_device_add_yet = true or hotpluggable = false).
&error_fatal is perfectly adequate there. &error_abort would be
misleading, because when it fails, it's almost certainly because the
user tried to create too big a machine.
Now consider a hot-pluggable device. Its recognized "large" allocations
all fail gracefully. What about its other allocations? Two kinds: the
ones visible in the device model code, and the ones hiding elsewhere,
which include "a few" of the 2300+ uses of GLib memory allocation. The
latter exit(). Why should the former abort()?
Now use that hot-pluggable device during machine initialization.
abort() is again misleading.
Let's avoid a fruitless debate on when to exit() and when to abort() on
out-of-memory, and just stick to exit(). We don't need a core dump to
tell a developer to fix his lazy error handling.
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-10 9:22 ` Markus Armbruster
@ 2015-12-10 11:10 ` Laszlo Ersek
0 siblings, 0 replies; 19+ messages in thread
From: Laszlo Ersek @ 2015-12-10 11:10 UTC (permalink / raw)
To: Markus Armbruster
Cc: Peter Maydell, qemu-devel, Dr. David Alan Gilbert,
Peter Crosthwaite, Paolo Bonzini, Andreas Färber
On 12/10/15 10:22, Markus Armbruster wrote:
> Laszlo Ersek <lersek@redhat.com> writes:
>
>> I've been following this discussion with great interest.
>>
>> My opinion should not be considered, because I won't be turning my
>> opinion into new code, or an agreement to support / maintain code. :)
>>
>> My opinion is that
>> - every single allocation needs to be checked rigorously,
>> - any partial construction of a more complex object needs to be rolled
>> back in precise reverse order upon encountering any kind of failure
>> (not just allocation)
>> - this should occur regardless of testing coverage (although projects
>> exist (for example, SQLite, IIRC) that use random or systematic
>> malloc() error injection in their test suite, for good coverage)
>> - the primary requirements for this to work are:
>> - a clear notion of ownership at any point in the code
>> - a disciplined approach to ownership tracking; for example, a helper
>> callee (responsible for constructing a member of a more complex
>> object) is forbidden from releasing "sibling" resources allocated
>> by the caller
>>
>> This is possible to do (I'm doing it and enforcing it in OVMF), but it
>> takes a lot of discipline, and *historically* the QEMU codebase has
>> stunk, whenever I've looked at its ownership tracking during
>> construction of objects.
>
> Doing it from the start is one thing. Laudable in theory, justifiable
> in practice for some applications (I've seen it done for a satellite's
> avionics software), but in general, life's too short for that kind of
> manual error handling.
>
> Retrofitting it into a big & messy existing code base is another thing
> entirely. Believe me, I've done my share of cleaning up stuff. When
> you're tempted to mess with something that works (although you barely
> understand how or even why) to make it neater, or easier to maintain, or
> handle vanishingly rare errors more gracefully, the one thing you need
> to know is when *not* to do it, because the risk of you fscking it up
> outweighs whatever you hope to gain.
I agree. :)
Thanks
Laszlo
>
>> I feel that in the last sequence of months (years?) the developer
>> discipline and the codebase have improved a *great* deal. Still I cannot
>> say how feasible it would be to bring all existent code into conformance
>> with the above.
>
> Hah! We can't even get all the existent qdev init() methods converted
> to realize(), which is a *much* simpler task. And that conveniently
> ignores all the code that hasn't even made it to qdev.
>
> We got bigger fish to fry. In fact, the queue of fish waiting to be
> fried goes along the block a couple of times.
>
>> ... As I said, I just wanted to express this opinion as another (not
>> really practical) data point. My children utterly hate spinach, so
>> Markus's counterpoint is definitely not lost on me.
>
> To anyone advocating creating thousands (or even dozens) of non-trivial
> new error paths: come back with a test suite. We have mountains of
> evidence for chronic inability to get error paths right. In fact, I'm
> working on a patch that adds a bunch of FIXMEs for one class of flawed
> error recovery. Just FIXMEs, because I don't dare fixing the bugs
> myself without a way to test the fix, and the fishes in the queue are
> looking at me accusingly already.
>
> If I may suggest a logo for such a test suite: kids eating spinach with
> a smile feels apt.
>
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-10 11:06 ` Markus Armbruster
@ 2015-12-10 11:21 ` Dr. David Alan Gilbert
2015-12-10 11:22 ` Paolo Bonzini
2015-12-10 11:26 ` Paolo Bonzini
1 sibling, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2015-12-10 11:21 UTC (permalink / raw)
To: Markus Armbruster
Cc: Paolo Bonzini, Peter Crosthwaite, qemu-devel, Andreas Färber,
Peter Maydell
* Markus Armbruster (armbru@redhat.com) wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
> > On 09/12/2015 10:30, Markus Armbruster wrote:
> >> My current working assumption is that passing &error_fatal to
> >> memory_region_init_ram() & friends is okay even in realize() methods and
> >> their supporting code, except when the allocation can be large.
> >
> > I suspect a lot of memory_region_init_ram()s could be considered
> > potentially large (at least in the 16-64 megabytes range). Propagation
> > of memory_region_init_ram() failures is easy enough, thanks to Error**,
> > that we should just do it.
>
> Propagating an out-of-memory error right in realize() is easy. What's
> not so easy is making realize() fail cleanly (all side effects undone;
> we get that wrong in many places), and finding and propagating
> out-of-memory errors hiding deeper in the call tree.
>
> However, genuinely "large" allocations should be relatively few, and
> handling them gracefully in hot-pluggable devices is probably feasible.
>
> I doubt ensuring *all* allocations on behalf of a hot-pluggable device
> are handled gracefully is a good use of our resources, or even
> feasible.
>
> Likewise, graceful error handling for devices that cannot be hot-plugged
> feels like a waste of resources we can ill afford. I think we should
> simply document their non-gracefulness by either setting hotpluggable =
> false or cannot_instantiate_with_device_add_yet = true with a suitable
> comment.
>
> > Even if we don't, we should use &error_abort, not &error_fatal
> > (programmer error---due to laziness---rather than user error).
> > &error_fatal should really be restricted to code that is running very
> > close to main().
>
> "Very close to main" is a question of dynamic context.
>
> Consider a device that can only be created during machine initialization
> (cannot_instantiate_with_device_add_yet = true or hotpluggable = false).
> &error_fatal is perfectly adequate there. &error_abort would be
> misleading, because when it fails, it's almost certainly because the
> user tried to create too big a machine.
>
> Now consider a hot-pluggable device. Its recognized "large" allocations
> all fail gracefully. What about its other allocations? Two kinds: the
> ones visible in the device model code, and the ones hiding elsewhere,
> which include "a few" of the 2300+ uses of GLib memory allocation. The
> latter exit(). Why should the former abort()?
>
> Now use that hot-pluggable device during machine initialization.
> abort() is again misleading.
>
> Let's avoid a fruitless debate on when to exit() and when to abort() on
> out-of-memory, and just stick to exit(). We don't need a core dump to
> tell a developer to fix his lazy error handling.
The tricky bit is when a user says 'it crashed with out of memory' and
we just did an exit: how do we find out which bit needs better error
handling? I guess the use of abort() could tell us that; however, it's
a really big assumption that in an OOM case we'd be able to dump the
information.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-10 11:21 ` Dr. David Alan Gilbert
@ 2015-12-10 11:22 ` Paolo Bonzini
0 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2015-12-10 11:22 UTC (permalink / raw)
To: Dr. David Alan Gilbert, Markus Armbruster
Cc: Peter Maydell, Peter Crosthwaite, qemu-devel, Andreas Färber
On 10/12/2015 12:21, Dr. David Alan Gilbert wrote:
> I guess the use of abort() could tell us
> that - however it's a really big assumption that in an OOM case we'd
> be able to dump the information.
If it's not OOM, but just a multi-gigabyte allocation, we should.
Paolo
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-10 11:06 ` Markus Armbruster
2015-12-10 11:21 ` Dr. David Alan Gilbert
@ 2015-12-10 11:26 ` Paolo Bonzini
2015-12-10 12:25 ` Markus Armbruster
1 sibling, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2015-12-10 11:26 UTC (permalink / raw)
To: Markus Armbruster
Cc: Peter Maydell, Peter Crosthwaite, Dr. David Alan Gilbert,
Andreas Färber, qemu-devel
On 10/12/2015 12:06, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
>> On 09/12/2015 10:30, Markus Armbruster wrote:
>>> My current working assumption is that passing &error_fatal to
>>> memory_region_init_ram() & friends is okay even in realize() methods and
>>> their supporting code, except when the allocation can be large.
>>
>> I suspect a lot of memory_region_init_ram()s could be considered
>> potentially large (at least in the 16-64 megabytes range). Propagation
>> of memory_region_init_ram() failures is easy enough, thanks to Error**,
>> that we should just do it.
>
> Propagating an out-of-memory error right in realize() is easy. What's
> not so easy is making realize() fail cleanly (all side effects undone;
> we get that wrong in many places), and finding and propagating
> out-of-memory errors hiding deeper in the call tree.
grep is your friend. We're talking of a subset of these:
hw/block/onenand.c: memory_region_init_ram(&s->ram, OBJECT(s), "onenand.ram",
hw/block/pflash_cfi01.c: memory_region_init_rom_device(
hw/block/pflash_cfi02.c: memory_region_init_rom_device(&pfl->orig_mem, OBJECT(pfl), pfl->be ?
hw/display/cg3.c: memory_region_init_ram(&s->rom, obj, "cg3.prom", FCODE_MAX_ROM_SIZE,
hw/display/cg3.c: memory_region_init_ram(&s->vram_mem, NULL, "cg3.vram", s->vram_size,
hw/display/g364fb.c: memory_region_init_ram_ptr(&s->mem_vram, NULL, "vram",
hw/display/qxl.c: memory_region_init_ram(&qxl->rom_bar, OBJECT(qxl), "qxl.vrom",
hw/display/qxl.c: memory_region_init_ram(&qxl->vram_bar, OBJECT(qxl), "qxl.vram",
hw/display/qxl.c: memory_region_init_ram(&qxl->vga.vram, OBJECT(dev), "qxl.vgavram",
hw/display/sm501.c: memory_region_init_ram(&s->local_mem_region, NULL, "sm501.local",
hw/display/tc6393xb.c: memory_region_init_ram(&s->vram, NULL, "tc6393xb.vram", 0x100000,
hw/display/tcx.c: memory_region_init_ram(&s->rom, obj, "tcx.prom", FCODE_MAX_ROM_SIZE,
hw/display/tcx.c: memory_region_init_ram(&s->vram_mem, OBJECT(s), "tcx.vram",
hw/display/vga.c: memory_region_init_ram(&s->vram, obj, "vga.vram", s->vram_size,
hw/display/vmware_vga.c: memory_region_init_ram(&s->fifo_ram, NULL, "vmsvga.fifo", s->fifo_size,
hw/dma/rc4030.c: memory_region_init_rom_device(&s->dma_tt, o,
hw/i386/pci-assign-load-rom.c: memory_region_init_ram(&dev->rom, owner, name, st.st_size, &error_abort);
hw/input/milkymist-softusb.c: memory_region_init_ram(&s->pmem, OBJECT(s), "milkymist-softusb.pmem",
hw/input/milkymist-softusb.c: memory_region_init_ram(&s->dmem, OBJECT(s), "milkymist-softusb.dmem",
hw/net/dp8393x.c: memory_region_init_ram(&s->prom, OBJECT(dev),
hw/net/milkymist-minimac2.c: memory_region_init_ram(&s->buffers, OBJECT(dev), "milkymist-minimac2.buffers",
hw/pci/pci.c: memory_region_init_ram(&pdev->rom, OBJECT(pdev), name, size, &error_fatal);
hw/s390x/sclp.c: memory_region_init_ram(standby_ram, NULL, id, this_subregion_size,
Some of them are in sysbus devices and can thus be ignored. Some of the
others may indeed need introduction of additional error propagation or may
be non-trivial.
> I doubt ensuring *all* allocations on behalf of a hot-pluggable device
> are handled gracefully is a good use of our resources, or even
> feasible.
I agree. I don't think we have large allocations in device models,
other than from memory region allocation (e.g. VRAM).
> Consider a device that can only be created during machine initialization
> (cannot_instantiate_with_device_add_yet = true or hotpluggable = false).
> &error_fatal is perfectly adequate there. &error_abort would be
> misleading, because when it fails, it's almost certainly because the
> user tried to create too big a machine.
I agree here.
> Now consider a hot-pluggable device. Its recognized "large" allocations
> all fail gracefully. What about its other allocations? Two kinds: the
> ones visible in the device model code, and the ones hiding elsewhere,
> which include "a few" of the 2300+ uses of GLib memory allocation. The
> latter exit(). Why should the former abort()?
We've already ruled that the latter simply won't happen, so we can say at
the same time that g_malloc() does not exit at all. :)
&error_fatal is a user error, &error_abort is an assertion failure. This
is much closer to an assertion failure.
> Now use that hot-pluggable device during machine initialization.
> abort() is again misleading.
Perhaps, but---because it's hotpluggable---what happens during machine
initialization is much less important. It doesn't lead to lost data.
An abort() is much more likely to catch the developers' and the users'
attention and much more likely to be reported (compare "huh, it went
away?!?" with "abrt is asking me to open a Fedora bug, interesting").
> Let's avoid a fruitless debate on when to exit() and when to abort() on
> out-of-memory, and just stick to exit(). We don't need a core dump to
> tell a developer to fix his lazy error handling.
Perhaps we do...
Paolo
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] Error handling in realize() methods
2015-12-10 11:26 ` Paolo Bonzini
@ 2015-12-10 12:25 ` Markus Armbruster
0 siblings, 0 replies; 19+ messages in thread
From: Markus Armbruster @ 2015-12-10 12:25 UTC (permalink / raw)
To: Paolo Bonzini
Cc: qemu-devel, Peter Maydell, Peter Crosthwaite,
Dr. David Alan Gilbert, Andreas Färber
Paolo Bonzini <pbonzini@redhat.com> writes:
> On 10/12/2015 12:06, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>>> On 09/12/2015 10:30, Markus Armbruster wrote:
>>>> My current working assumption is that passing &error_fatal to
>>>> memory_region_init_ram() & friends is okay even in realize() methods and
>>>> their supporting code, except when the allocation can be large.
>>>
>>> I suspect a lot of memory_region_init_ram()s could be considered
>>> potentially large (at least in the 16-64 megabytes range). Propagation
>>> of memory_region_init_ram() failures is easy enough, thanks to Error**,
>>> that we should just do it.
>>
>> Propagating an out-of-memory error right in realize() is easy. What's
>> not so easy is making realize() fail cleanly (all side effects undone;
>> we get that wrong in many places), and finding and propagating
>> out-of-memory errors hiding deeper in the call tree.
>
> grep is your friend. We're talking of a subset of these:
[...]
Yes, finding just the guest memory allocations isn't hard. But making
them fail cleanly is, judging from the code that gets it wrong.
[...]
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2015-12-10 12:25 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-08 13:47 [Qemu-devel] Error handling in realize() methods Markus Armbruster
2015-12-08 14:19 ` Dr. David Alan Gilbert
2015-12-09 9:30 ` Markus Armbruster
2015-12-09 10:29 ` Dr. David Alan Gilbert
2015-12-09 11:10 ` Laszlo Ersek
2015-12-10 9:22 ` Markus Armbruster
2015-12-10 11:10 ` Laszlo Ersek
2015-12-09 11:47 ` Peter Maydell
2015-12-09 12:25 ` Laszlo Ersek
2015-12-09 13:21 ` Dr. David Alan Gilbert
2015-12-10 9:27 ` Markus Armbruster
2015-12-09 13:09 ` Paolo Bonzini
2015-12-09 13:12 ` Dr. David Alan Gilbert
2015-12-09 13:43 ` Paolo Bonzini
2015-12-10 11:06 ` Markus Armbruster
2015-12-10 11:21 ` Dr. David Alan Gilbert
2015-12-10 11:22 ` Paolo Bonzini
2015-12-10 11:26 ` Paolo Bonzini
2015-12-10 12:25 ` Markus Armbruster
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).