From: Rusty Russell
Date: Wed, 19 Mar 2014 11:04:19 +1030
Message-ID: <87lhw7rppw.fsf@rustcorp.com.au>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Re: [Qemu-devel] virtio device error reporting best practice?
To: Dave Airlie, "qemu-devel@nongnu.org"

Dave Airlie writes:
> So I'm looking at how best to do virtio gpu device error reporting,
> and how to deal with illegal stuff,
>
> I've two levels of errors I want to support,
>
> a) unrecoverable or bad guest kernel programming errors,

The QEMU standard approach is to exit at this point.  No, really.

> b) per 3D context errors from the renderer backend,
>
> (b) I can easily report in an event queue and the guest kernel can in
> theory blow away the offenders, this is how GL works with some
> extensions,

That's probably sanest.

> For (a) I can expect a response from every command I put into the main
> GPU control queue, the response should always be no error, but in some
> cases it will be because the guest hit some host resource error, or
> asked for something insane, (guest kernel drivers would be broken in
> most of these cases).
>
> Alternately I can use the separate event queue to send async errors
> when the guest does something bad,
>
> I'm also considering adding some sort of flag in config space saying
> the device needs a reset before it will continue doing anything,

I generally dislike error codes which Never Happen; it's like making
every void function return int just in case: the caller has no idea
what to do if it fails.

The litmus test: does *your* guest handle failures other than by giving
up on the device?  If so, sure, you need to have a sane error-reporting
strategy.

> The main reason I'm considering this stuff is for security reasons if
> the guest asks for something really illegal or crazy what should the
> expected behaviour of the host be? (at least secure I know that).

If the guest userspace can do it, don't exit.  If the kernel only, and
it should have known better, abort is OK.

Sure that doesn't help much!
Rusty.
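
A minimal sketch of the rule of thumb above, written in Python for brevity rather than QEMU's C. The names `Origin`, `Action`, and `host_action` are hypothetical and not part of QEMU or virtio. "Report the error" stands in for whichever mechanism the device ends up using, such as the event queue discussed in the thread.

```python
from enum import Enum


class Origin(Enum):
    """Who is able to trigger the bad request (hypothetical classification)."""
    GUEST_USERSPACE = "guest userspace"
    GUEST_KERNEL = "guest kernel"


class Action(Enum):
    """What the host device model does in response (hypothetical names)."""
    REPORT_ERROR = "report the error and keep running"
    ABORT = "abort"


def host_action(origin: Origin) -> Action:
    """Apply the rule: userspace-triggerable -> don't exit; kernel-only -> abort is OK."""
    if origin is Origin.GUEST_USERSPACE:
        # Assumed rationale: unprivileged guest code should not be able
        # to take the whole VM down.
        return Action.REPORT_ERROR
    # Only a broken guest kernel driver can get here.
    return Action.ABORT


if __name__ == "__main__":
    for origin in Origin:
        print(f"{origin.value}: {host_action(origin).value}")
```

The sketch covers only the two cases named in the message. A real device would still have to decide which requests guest userspace can actually reach.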