From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53400)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1a6z7Q-0007BA-Ow
	for qemu-devel@nongnu.org; Thu, 10 Dec 2015 06:10:41 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1a6z7L-0002GP-Dx
	for qemu-devel@nongnu.org; Thu, 10 Dec 2015 06:10:40 -0500
Received: from mx1.redhat.com ([209.132.183.28]:60809)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1a6z7L-0002GG-03
	for qemu-devel@nongnu.org; Thu, 10 Dec 2015 06:10:35 -0500
References: <87a8pl9hmt.fsf@blackfin.pond.sub.org>
	<20151208141938.GB2593@work-vm> <87io480y0n.fsf@blackfin.pond.sub.org>
	<20151209102931.GC2712@work-vm> <56680C1C.1030604@redhat.com>
	<87io46mzda.fsf@blackfin.pond.sub.org>
From: Laszlo Ersek <lersek@redhat.com>
Message-ID: <56695DA7.2030908@redhat.com>
Date: Thu, 10 Dec 2015 12:10:31 +0100
MIME-Version: 1.0
In-Reply-To: <87io46mzda.fsf@blackfin.pond.sub.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Error handling in realize() methods
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Markus Armbruster <armbru@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>, qemu-devel@nongnu.org, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Peter Crosthwaite <crosthwaitepeter@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, =?UTF-8?Q?Andreas_F=c3=a4rber?= <afaerber@suse.de>

On 12/10/15 10:22, Markus Armbruster wrote:
> Laszlo Ersek <lersek@redhat.com> writes:
> 
>> I've been following this discussion with great interest.
>>
>> My opinion should not be considered, because I won't be turning my
>> opinion into new code, or an agreement to support / maintain code. :)
>>
>> My opinion is that
>> - every single allocation needs to be checked rigorously,
>> - any partial construction of a more complex object needs to be rolled
>>   back in precise reverse order upon encountering any kind of failure
>>   (not just allocation)
>> - this should occur regardless of testing coverage (although projects
>>   exist (for example, SQLite, IIRC) that use random or systematic
>>   malloc() error injection in their test suite, for good coverage)
>> - the primary requirements for this to work are:
>>   - a clear notion of ownership at any point in the code
>>   - a disciplined approach to ownership tracking; for example, a helper
>>     callee (responsible for constructing a member of a more complex
>>     object) is forbidden from releasing "sibling" resources allocated
>>     by the caller
>>
>> This is possible to do (I'm doing it and enforcing it in OVMF), but it
>> takes a lot of discipline, and *historically* the QEMU codebase has
>> stunk, whenever I've looked at its ownership tracking during
>> construction of objects.
> 
> Doing it from the start is one thing.  Laudable in theory, justifiable
> in practice for some applications (I've seen it done for a satellite's
> avionics software), but in general, life's too short for that kind of
> manual error handling.
> 
> Retrofitting it into a big & messy existing code base is another thing
> entirely.  Believe me, I've done my share of cleaning up stuff.  When
> you're tempted to mess with something that works (although you barely
> understand how or even why) to make it neater, or easier to maintain, or
> handle vanishingly rare errors more gracefully, the one thing you need
> to know is when *not* to do it, because the risk of you fscking it up
> outweighs whatever you hope to gain.

I agree. :)

Thanks
Laszlo

> 
>> I feel that in the last sequence of months (years?) the developer
>> discipline and the codebase have improved a *great* deal. Still I cannot
>> say how feasible it would be to bring all existent code into conformance
>> with the above.
> 
> Hah!  We can't even get all the existent qdev init() methods converted
> to realize(), which is a *much* simpler task.  And that conveniently
> ignores all the code that hasn't even made it to qdev.
> 
> We got bigger fish to fry.  In fact, the queue of fish waiting to be
> fried goes along the block a couple of times.
> 
>> ... As I said, I just wanted to express this opinion as another (not
>> really practical) data point. My children utterly hate spinach, so
>> Markus's counterpoint is definitely not lost on me.
> 
> To anyone advocating creating thousands (or even dozens) of non-trivial
> new error paths: come back with a test suite.  We have mountains of
> evidence for chronic inability to get error paths right.  In fact, I'm
> working on a patch that adds a bunch of FIXMEs for one class of flawed
> error recovery.  Just FIXMEs, because I don't dare fixing the bugs
> myself without a way to test the fix, and the fishes in the queue are
> looking at me accusingly already.
> 
> If I may suggest a logo for such a test suite: kids eating spinach with
> a smile feel apt.
>