qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v1 0/1] block: Add numeric errno field to BLOCK_IO_ERROR events
@ 2017-12-22  0:11 Jack Schwartz
  2017-12-22  0:11 ` [Qemu-devel] [PATCH v1 1/1] " Jack Schwartz
  0 siblings, 1 reply; 8+ messages in thread
From: Jack Schwartz @ 2017-12-22  0:11 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, mreitz, armbru, eblake, karl.heubaum, konrad.wilk

Currently, BLOCK_IO_ERROR events have a string error "reason" field
which is derived from errno.  The proposed change adds errno itself
as a field to these events.  Figuring out the error by comparing the
(int) errno itself is easier than comparing a string.  There is also
a comment in the code that the reason field should not be parsed by
applications.

Sample QMP output of modified events adds the errno field as follows (see
last line):

{"timestamp": {"seconds": 1509071709, "microseconds": 563303}, "event":
"BLOCK_IO_ERROR", "data": {"device": "ide0-hd0", "node-name": "#block128",
"reason": "Input/output error", "operation": "write", "action": "ignore",
"errno": 5}}

Testing:
- Artificially created error conditions that emit BLOCK_IO_ERROR events.
Verified those events could be viewed by the QMP monitor and by the
qmp-shell; and that event behavior with those two utilities was identical.
- Ran tests via "make check" from the build root.  There were no changes
from vanilla build when building or running.

Homework:
- Looked through source and build trees for tests and scripts which
reference BLOCK_IO_ERROR events.  No direct references to such events were
found.  No direct references to BLOCK_IO_ERROR events implies there won't be
references to specific fields within those events.

- What about Windows?
  - The file block/block-backend.c is the only C file with a code change.
The file block/Makefile brings block-backend.o into both Windows and Linux
compilations.  The change introduces an additional reference to errno, which
strerror already calls, even with Windows.  That file's prior reference to
errno confirms that Windows will work with the code change.
  - If there would be a Linux vs Windows difference in mapping of errno to
error string values, that difference would have been in place before my
changes.

  Thanks,
  Jack

Jack Schwartz (1):
  block: Add numeric errno field to BLOCK_IO_ERROR events

 block/block-backend.c |  2 +-
 qapi/block-core.json  | 12 ++++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2017-12-22  0:11 [Qemu-devel] [PATCH v1 0/1] block: Add numeric errno field to BLOCK_IO_ERROR events Jack Schwartz
@ 2017-12-22  0:11 ` Jack Schwartz
  2017-12-22  1:08   ` Eric Blake
  2017-12-22 13:52   ` [Qemu-devel] " Kevin Wolf
  0 siblings, 2 replies; 8+ messages in thread
From: Jack Schwartz @ 2017-12-22  0:11 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, mreitz, armbru, eblake, karl.heubaum, konrad.wilk

BLOCK_IO_ERROR events currently contain a "reason" string which is
strerror(errno) of the error.  This enhancement provides those events with
the numeric errno value as well, since it is easier to parse for error type
than a string.

Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
---
 block/block-backend.c |  2 +-
 qapi/block-core.json  | 12 ++++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index baef8e7..f628668 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1572,7 +1572,7 @@ static void send_qmp_error_event(BlockBackend *blk,
     qapi_event_send_block_io_error(blk_name(blk),
                                    bdrv_get_node_name(blk_bs(blk)), optype,
                                    action, blk_iostatus_is_enabled(blk),
-                                   error == ENOSPC, strerror(error),
+                                   error == ENOSPC, error, strerror(error),
                                    &error_abort);
 }
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a8cdbc3..b7beca7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3660,6 +3660,11 @@
 #           io-status is present, please see query-block documentation
 #           for more information (since: 2.2)
 #
+# @errno: int describing the error cause, provided for applications.
+#         (Note: while most errnos are posix compliant between OSs, it
+#         is possible some errno values can vary among different OSs.)
+#         (since 2.12)
+#
 # @reason: human readable string describing the error cause.
 #          (This field is a debugging aid for humans, it should not
 #           be parsed by applications) (since: 2.2)
@@ -3675,14 +3680,17 @@
 #      "data": { "device": "ide0-hd1",
 #                "node-name": "#block212",
 #                "operation": "write",
-#                "action": "stop" },
+#                "action": "stop",
+#                "nospace": false,
+#                "errno": 5,
+#                "reason": "Input/output error" },
 #      "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
 #
 ##
 { 'event': 'BLOCK_IO_ERROR',
   'data': { 'device': 'str', 'node-name': 'str', 'operation': 'IoOperationType',
             'action': 'BlockErrorAction', '*nospace': 'bool',
-            'reason': 'str' } }
+            'errno': 'int', 'reason': 'str' } }
 
 ##
 # @BLOCK_JOB_COMPLETED:
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2017-12-22  0:11 ` [Qemu-devel] [PATCH v1 1/1] " Jack Schwartz
@ 2017-12-22  1:08   ` Eric Blake
  2017-12-22  1:15     ` [Qemu-devel] [Qemu-block] " Eric Blake
  2017-12-22 13:52   ` [Qemu-devel] " Kevin Wolf
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Blake @ 2017-12-22  1:08 UTC (permalink / raw)
  To: Jack Schwartz, qemu-devel, qemu-block
  Cc: kwolf, mreitz, armbru, karl.heubaum, konrad.wilk

On 12/21/2017 06:11 PM, Jack Schwartz wrote:
> BLOCK_IO_ERROR events currently contain a "reason" string which is
> strerror(errno) of the error.  This enhancement provides those events with
> the numeric errno value as well, since it is easier to parse for error type
> than a string.

NACK.  Numeric errno values are platform-dependent, but QMP must be 
platform-independent.  If you want to expose errno NAMES (not values), 
then create a QAPI enum and add the enum to the error structure (so that 
you are still passing names, not int values, over the wire).

> +++ b/qapi/block-core.json
> @@ -3660,6 +3660,11 @@
>   #           io-status is present, please see query-block documentation
>   #           for more information (since: 2.2)
>   #
> +# @errno: int describing the error cause, provided for applications.
> +#         (Note: while most errnos are posix compliant between OSs, it
> +#         is possible some errno values can vary among different OSs.)
> +#         (since 2.12)

The proof is in the pudding - if your documentation has to give this big 
disclaimer, then what you are adding is not portable and should not be 
added in that manner.

> +#
>   # @reason: human readable string describing the error cause.
>   #          (This field is a debugging aid for humans, it should not
>   #           be parsed by applications) (since: 2.2)
> @@ -3675,14 +3680,17 @@
>   #      "data": { "device": "ide0-hd1",
>   #                "node-name": "#block212",
>   #                "operation": "write",
> -#                "action": "stop" },
> +#                "action": "stop",
> +#                "nospace": false,
> +#                "errno": 5,

So this should be "errno":"ENOSPC", not 5.

> +#                "reason": "Input/output error" },
>   #      "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>   #
>   ##
>   { 'event': 'BLOCK_IO_ERROR',
>     'data': { 'device': 'str', 'node-name': 'str', 'operation': 'IoOperationType',
>               'action': 'BlockErrorAction', '*nospace': 'bool',
> -            'reason': 'str' } }
> +            'errno': 'int', 'reason': 'str' } }

and this should be the name of a QAPI enum type, not 'int'.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2017-12-22  1:08   ` Eric Blake
@ 2017-12-22  1:15     ` Eric Blake
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Blake @ 2017-12-22  1:15 UTC (permalink / raw)
  To: Jack Schwartz, qemu-devel, qemu-block; +Cc: kwolf, armbru, konrad.wilk, mreitz

On 12/21/2017 07:08 PM, Eric Blake wrote:

>>   #
>> +# @errno: int describing the error cause, provided for applications.
>> +#         (Note: while most errnos are posix compliant between OSs, it
>> +#         is possible some errno values can vary among different OSs.)
>> +#         (since 2.12)
> 
> The proof is in the pudding - if your documentation has to give this big 
> disclaimer, then what you are adding is not portable and should not be 
> added in that manner.

To follow up to myself, POSIX explicitly says that errno values are 
implementation dependent, and there is NO requirement that errno value 1 
be EPERM, for example.  And while qemu does not target GNU Hurd, that is 
a classic example of a system where errno values intentionally do not 
fit in 8 bits.  So you can't argue that there are "POSIX-compliant errno 
values", because POSIX doesn't mandate specific values.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2017-12-22  0:11 ` [Qemu-devel] [PATCH v1 1/1] " Jack Schwartz
  2017-12-22  1:08   ` Eric Blake
@ 2017-12-22 13:52   ` Kevin Wolf
  2018-01-08 19:57     ` Jack Schwartz
  1 sibling, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2017-12-22 13:52 UTC (permalink / raw)
  To: Jack Schwartz
  Cc: qemu-devel, qemu-block, mreitz, armbru, eblake, karl.heubaum,
	konrad.wilk

Am 22.12.2017 um 01:11 hat Jack Schwartz geschrieben:
> BLOCK_IO_ERROR events currently contain a "reason" string which is
> strerror(errno) of the error.  This enhancement provides those events with
> the numeric errno value as well, since it is easier to parse for error type
> than a string.
> 
> Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>

Apart from the technical details that Eric mentioed, I wonder what is
your use case for this?

Exposing errors in a machine readable form was discussed earlier, and
the result was that nobody had an actual use for error codes other than
presenting the right error message to the user - which the error string
already achieves.

The only exception so far was ENOSPC, which some management tools like
oVirt respond to by increasing the volume size, so this was mapped into
a bool.

Kevin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2017-12-22 13:52   ` [Qemu-devel] " Kevin Wolf
@ 2018-01-08 19:57     ` Jack Schwartz
  2018-01-09 10:24       ` Kevin Wolf
  0 siblings, 1 reply; 8+ messages in thread
From: Jack Schwartz @ 2018-01-08 19:57 UTC (permalink / raw)
  To: Kevin Wolf, eblake
  Cc: qemu-block, Konrad Rzeszutek Wilk, armbru, qemu-devel, mreitz,
	Karl Heubaum

Hi Kevin.

On 2017-12-22 05:52, Kevin Wolf wrote:
> Am 22.12.2017 um 01:11 hat Jack Schwartz geschrieben:
>> BLOCK_IO_ERROR events currently contain a "reason" string which is
>> strerror(errno) of the error.  This enhancement provides those events with
>> the numeric errno value as well, since it is easier to parse for error type
>> than a string.
>>
>> Signed-off-by: Jack Schwartz<jack.schwartz@oracle.com>
>> Reviewed-by: Konrad Rzeszutek Wilk<konrad.wilk@oracle.com>
>> Reviewed-by: Karl Heubaum<karl.heubaum@oracle.com>
> Apart from the technical details that Eric mentioed, I wonder what is
> your use case for this?
We have thousands of servers in our cloud, and would like to closely 
monitor for different kinds of disk errors without parsing the 
non-machine-readable error string.
> Exposing errors in a machine readable form was discussed earlier,
OK, found it.  April of 2010.

Upshot of discussion: exposing naked errnos are platform dependent.
>   and
> the result was that nobody had an actual use for error codes other than
> presenting the right error message to the user - which the error string
> already achieves.
Given the platform independence requirement, exposing errors to clients 
is not that simple given that different OSs use different errno values.  
Other options/considerations than exposing naked errno values:

- Having a platform-independent enumeration of errors, as Eric 
suggested.  This would have to explicitly set an enumerated value for 
each individual errno we are interested in.  It would be returned in a 
field that ~parallels the "reason" string.  This should be OK since for 
BLOCK_IO_ERROR events we could limit values to just storage device 
errors plus a default "other"; otherwise this could be hard to maintain.

- The strerror strings cannot be used because they can change with 
locale. (This also assumes the strings are identical for given errnos 
cross-platform, and that there are no typos - which are not 
automatically checked-for.)

     Thanks,
     Jack

P.S.  Please excuse the delayed reply due to vacation / company shutdown.
> The only exception so far was ENOSPC, which some management tools like
> oVirt respond to by increasing the volume size, so this was mapped into
> a bool.
>
> Kevin
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2018-01-08 19:57     ` Jack Schwartz
@ 2018-01-09 10:24       ` Kevin Wolf
  2018-01-10 21:28         ` Jack Schwartz
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2018-01-09 10:24 UTC (permalink / raw)
  To: Jack Schwartz
  Cc: eblake, qemu-block, Konrad Rzeszutek Wilk, armbru, qemu-devel,
	mreitz, Karl Heubaum

Am 08.01.2018 um 20:57 hat Jack Schwartz geschrieben:
> Hi Kevin.
> 
> On 2017-12-22 05:52, Kevin Wolf wrote:
> > Am 22.12.2017 um 01:11 hat Jack Schwartz geschrieben:
> > > BLOCK_IO_ERROR events currently contain a "reason" string which is
> > > strerror(errno) of the error.  This enhancement provides those events with
> > > the numeric errno value as well, since it is easier to parse for error type
> > > than a string.
> > > 
> > > Signed-off-by: Jack Schwartz<jack.schwartz@oracle.com>
> > > Reviewed-by: Konrad Rzeszutek Wilk<konrad.wilk@oracle.com>
> > > Reviewed-by: Karl Heubaum<karl.heubaum@oracle.com>
> > Apart from the technical details that Eric mentioed, I wonder what is
> > your use case for this?
> We have thousands of servers in our cloud, and would like to closely monitor
> for different kinds of disk errors without parsing the non-machine-readable
> error string.

So do you actually care about the semantical difference between, say,
EINVAL and EIO, and treat them differently in the monitoring? To be
honest, I can't see anything useful you could do with this information
because there are so many possible causes for each of the error codes.

Because if the only thing you want to do with them is to log them in
different categories, you can use the error strings without parsing
them.

> > Exposing errors in a machine readable form was discussed earlier,
> OK, found it.  April of 2010.
> 
> Upshot of discussion: exposing naked errnos are platform dependent.

Right, that's what Eric mentioned.

> >   and
> > the result was that nobody had an actual use for error codes other than
> > presenting the right error message to the user - which the error string
> > already achieves.
> Given the platform independence requirement, exposing errors to clients is
> not that simple given that different OSs use different errno values.  Other
> options/considerations than exposing naked errno values:
> 
> - Having a platform-independent enumeration of errors, as Eric suggested. 
> This would have to explicitly set an enumerated value for each individual
> errno we are interested in.  It would be returned in a field that ~parallels
> the "reason" string.  This should be OK since for BLOCK_IO_ERROR events we
> could limit values to just storage device errors plus a default "other";
> otherwise this could be hard to maintain.

But what are "storage device errors"? Can't we get more or less any
error while processing an I/O request?

> - The strerror strings cannot be used because they can change with locale.
> (This also assumes the strings are identical for given errnos
> cross-platform, and that there are no typos - which are not automatically
> checked-for.)

You mean when you aggregate errors from multiple different hosts running
on different platforms and where you don't control the locale?

But cross-platform, even the exact numeric error codes you get may
differ, so they become even less meaningful than they already are on a
single platform.

Kevin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH v1 1/1] block: Add numeric errno field to BLOCK_IO_ERROR events
  2018-01-09 10:24       ` Kevin Wolf
@ 2018-01-10 21:28         ` Jack Schwartz
  0 siblings, 0 replies; 8+ messages in thread
From: Jack Schwartz @ 2018-01-10 21:28 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, Konrad Rzeszutek Wilk, armbru, qemu-devel, mreitz,
	Karl Heubaum

Hi Kevin.

Thanks for your feedback.

Looks like my team's project plans have changed, and there is no need to 
pursue this further.  We can work with the existing reason string.

         Thanks,
         Jack

On 01/09/18 02:24, Kevin Wolf wrote:
> Am 08.01.2018 um 20:57 hat Jack Schwartz geschrieben:
>> Hi Kevin.
>>
>> On 2017-12-22 05:52, Kevin Wolf wrote:
>>> Am 22.12.2017 um 01:11 hat Jack Schwartz geschrieben:
>>>> BLOCK_IO_ERROR events currently contain a "reason" string which is
>>>> strerror(errno) of the error.  This enhancement provides those events with
>>>> the numeric errno value as well, since it is easier to parse for error type
>>>> than a string.
>>>>
>>>> Signed-off-by: Jack Schwartz<jack.schwartz@oracle.com>
>>>> Reviewed-by: Konrad Rzeszutek Wilk<konrad.wilk@oracle.com>
>>>> Reviewed-by: Karl Heubaum<karl.heubaum@oracle.com>
>>> Apart from the technical details that Eric mentioed, I wonder what is
>>> your use case for this?
>> We have thousands of servers in our cloud, and would like to closely monitor
>> for different kinds of disk errors without parsing the non-machine-readable
>> error string.
> So do you actually care about the semantical difference between, say,
> EINVAL and EIO, and treat them differently in the monitoring? To be
> honest, I can't see anything useful you could do with this information
> because there are so many possible causes for each of the error codes.
>
> Because if the only thing you want to do with them is to log them in
> different categories, you can use the error strings without parsing
> them.
>
>>> Exposing errors in a machine readable form was discussed earlier,
>> OK, found it.  April of 2010.
>>
>> Upshot of discussion: exposing naked errnos are platform dependent.
> Right, that's what Eric mentioned.
>
>>>    and
>>> the result was that nobody had an actual use for error codes other than
>>> presenting the right error message to the user - which the error string
>>> already achieves.
>> Given the platform independence requirement, exposing errors to clients is
>> not that simple given that different OSs use different errno values.  Other
>> options/considerations than exposing naked errno values:
>>
>> - Having a platform-independent enumeration of errors, as Eric suggested.
>> This would have to explicitly set an enumerated value for each individual
>> errno we are interested in.  It would be returned in a field that ~parallels
>> the "reason" string.  This should be OK since for BLOCK_IO_ERROR events we
>> could limit values to just storage device errors plus a default "other";
>> otherwise this could be hard to maintain.
> But what are "storage device errors"? Can't we get more or less any
> error while processing an I/O request?
>
>> - The strerror strings cannot be used because they can change with locale.
>> (This also assumes the strings are identical for given errnos
>> cross-platform, and that there are no typos - which are not automatically
>> checked-for.)
> You mean when you aggregate errors from multiple different hosts running
> on different platforms and where you don't control the locale?
>
> But cross-platform, even the exact numeric error codes you get may
> differ, so they become even less meaningful than they already are on a
> single platform.
>
> Kevin
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-01-10 21:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-22  0:11 [Qemu-devel] [PATCH v1 0/1] block: Add numeric errno field to BLOCK_IO_ERROR events Jack Schwartz
2017-12-22  0:11 ` [Qemu-devel] [PATCH v1 1/1] " Jack Schwartz
2017-12-22  1:08   ` Eric Blake
2017-12-22  1:15     ` [Qemu-devel] [Qemu-block] " Eric Blake
2017-12-22 13:52   ` [Qemu-devel] " Kevin Wolf
2018-01-08 19:57     ` Jack Schwartz
2018-01-09 10:24       ` Kevin Wolf
2018-01-10 21:28         ` Jack Schwartz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).