From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1Ng3Ea-0005SH-1n for qemu-devel@nongnu.org; Fri, 12 Feb 2010 16:39:32 -0500 Received: from [199.232.76.173] (port=37189 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Ng3EY-0005S7-Mh for qemu-devel@nongnu.org; Fri, 12 Feb 2010 16:39:30 -0500 Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim 4.60) (envelope-from ) id 1Ng3EX-0004U5-9M for qemu-devel@nongnu.org; Fri, 12 Feb 2010 16:39:30 -0500 Received: from e4.ny.us.ibm.com ([32.97.182.144]:50541) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1Ng3EW-0004Tf-Tg for qemu-devel@nongnu.org; Fri, 12 Feb 2010 16:39:29 -0500 Received: from d01relay01.pok.ibm.com (d01relay01.pok.ibm.com [9.56.227.233]) by e4.ny.us.ibm.com (8.14.3/8.13.1) with ESMTP id o1CLSmbS002516 for ; Fri, 12 Feb 2010 16:28:48 -0500 Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay01.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o1CLd9K7109978 for ; Fri, 12 Feb 2010 16:39:11 -0500 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.14.3/8.13.1/NCO v10.0 AVout) with ESMTP id o1CLd8lK011370 for ; Fri, 12 Feb 2010 19:39:09 -0200 Message-ID: <4B75CA7B.7070906@linux.vnet.ibm.com> Date: Fri, 12 Feb 2010 15:39:07 -0600 From: Anthony Liguori MIME-Version: 1.0 References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: [Qemu-devel] Re: QError conversion problems: putting errors in context List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster Cc: qemu-devel@nongnu.org, Luiz Capitulino Hi Markus, On 02/12/2010 11:48 AM, Markus Armbruster wrote: > Our QError conversions were pretty straightforward so far. For example, > when we found > > monitor_printf(mon, "device is not removable\n"); > > in eject_device(), we created the obvious QError class for it: > > #define QERR_DEVICE_NOT_REMOVABLE \ > "{ 'class': 'DeviceNotRemovable', 'data': { 'device': %s } }" > > with the obvious pretty-print template: > > { > .error_fmt = QERR_DEVICE_NOT_REMOVABLE, > .desc = "Device %(device) is not removable", > }, > > and replaced the print with > > qemu_error_new(QERR_DEVICE_NOT_REMOVABLE, > bdrv_get_device_name(bs)); > > This even improved the message from "device is not removable" to "device > hda is not removable". Commit 2c2a6bb8. > > > We also ran into cases like this one, in qdev_device_add(): > > - qemu_error("Device \"%s\" not found. Try -device '?' for a list.\n", > - driver); > + qemu_error_new(QERR_DEVICE_NOT_FOUND, driver); > > Here, the pretty-print template became > > .desc = "The %(device) device has not been found", > > Note the loss of the helpful "Try -device '?' for a list." part. That's > because the same error is used in other places, where that piece of > advice doesn't apply. Commit 3ced9f7a. > > > For similar reasons, "Invalid CPU index" changed to the more generic > "Invalid parameter index" in monitor command cpu (commit cc0c4185). > More examples in my private tree: set_link's "invalid link status 'off'; > only 'up' or 'down' valid" becomes "Invalid parameter up_or_down", and > migrate's "migration already in progress" becomes "Already in progress". > > > Most of the conversions touched only monitor code. Things get much more > complicated for code shared between the monitor and other places. For > instance, there's this one in qdev_device_add(): > > qemu_error("-device: no driver specified\n"); > > Of course, this message is pretty pathetic already, in several ways: > > * A command line can contain several -device, and the error message > leaves finding the offending one to the user. > > * It's even worse with configuration files. FILE:LINENR is par for that > course. > > * It talks about -device even when we're in the monitor or a > configuration file. To trigger it in the monitor, try "device_add > a=b". > > From QMP's point of view, the appropriate error to use is this one: > > #define QERR_MISSING_PARAMETER \ > "{ 'class': 'MissingParameter', 'data': { 'name': %s } }" > > It's pretty-print template is > > .desc = "Parameter %(name) is missing", > > This changes the message "-device: no driver specified" to "Parameter > driver is missing". We go from bad to worse. > > > I think we have a few related but nevertheless distinct issues here: > > * We need to identify the error message's object. Proper identification > depends on the context in which the code is executing: > > - Command line: the option, with its argument, if any. > > - Config file: filename, line number. > > - Monitor: command (this is trivial) > > This is what's lacking in the "no driver specified" example, even > before conversion to QError. > > Dragging context information along all over the place so we can use it > to make better error messages would be cumbersome and invasive. There > must be a better way. > > Note that qemu_error_new() already has a primitive notion of context: > it distinguishes between human monitor, QMP monitor, and anything > else. This could be refined, so it can add suitable context > information to the error without encumbering its callers. For > instance, a hypothetical error message "pants on fire", emitted with > > qemu_error_new(QERR_PANTS_ON_FILE) > > should become > > - "qemu-system-x86_64: -device pants,color=black: pants on fire" in > the context of command line option "-device pants,color=black", > > - "vm.cfg:123: pants on fire" in the context of configuration file > vm.cfg, line 123, > > - "pants on fire" in the human monitor, and > > - { "error": { "class": "PantsOnFire", "data": {}, > "desc": "pants on fire" } } > in QMP. > > * Occasionally, we'd like to add help, but not for all uses of the same > error, and not in all contexts. > > This is the "Try -device '?' for a list" example. We'd like to print > that if we're working for -device (user qdev_device_add(), context > command line). And maybe "Try device_add ? for a list" if we're > working for device_add (same user, context human monitor). > > I believe this is rare enough to permit an ad hoc solution, like this: > > qemu_error_new(QERR_DEVICE_NOT_FOUND, driver); > printf_if_cmdline("Try -device '?' for a list.\n"); > printf_if_human_monitor("Try device_add ? for a list.\n"); > > Wouldn't work for the monitor right now, because the actual printing > of the error gets delayed until the handler returns, but that's > fixable. > > * Nice human-readable messages sometimes need information that should > not go into QMP. > > This is the "Migration already in progress" example. Since different > things can be in progress, it is tempting to do > > #define QERR_ALREADY_IN_PROGRESS \ > "{ 'class': 'AlreadyInProgress', 'data': { 'activity': %s } }" > > with pretty-print template > > .desc = "%(activity) already in progress" > > and use it like this: > > qemu_error_new(QERR_FAILED, "migration"); > > But that puts a rather pointless 'activity' member into the data > object. > > An obvious way to avoid that is to adopt a convention like "if the key > starts with '_', it's internal, and not to be passed over QMP". > > Another way (suggested by Anthony) is to permit overriding the > pretty-print template somehow. More flexible, but also more clumsy to > use. > > Comments? > Thanks for the write-up. Here's my view of how error reporting ought to work: 0) A user tries to do something via some piece of code that serves as the UI. This may be the -device option handling, the config file parser, QMP, or the human monitor. 1) Something, somewhere, generates an error. This is may be as primitive as an errno or as sophisticated as a QError object. In both cases, the error consists of structured data will well defined semantics. 2) That error is propagated up the call chain back to the UI code. 3) The UI code then decides how to present the error to the user. In an ideal world, qdev_device_add would either return a QError object or just an errno. The caller of qdev_device_add would then decide how to display that error to the user. The whole notion of having a description table baked into QError is wrong IMHO. You want as much contextual information as possible when generating an error for a user and the place you have that is at the point you're interacting with the user. It makes no sense to generate error messages deep in the bowels of QEMU. I'm a fan of decoupling the pretty printing from qerror.c and moving it into the monitor/command line parser. Regards, Anthony Liguori