KCIDB: Test schema enhancements

kernelci.lists.linux.dev archive mirror

* KCIDB: Test schema enhancements
@ 2024-08-05  9:56 Nikolai Kondrashov
  2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-05  9:56 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Mark Brown, Philip Li, Denys Fedoryshchenko,
	Michael Hofmann, Tales da Aparecida, Aditya Nagesh,
	Jeny Dhruvit Sheth, Sachin Sant, Hambardzumyan, Minas

Hello everyone (potentially) involved with sending data to KCIDB,

We've been working hard at KernelCI to bring you a bunch of exciting changes,
to be presented at LPC in September, and in our news channels.

Meanwhile I'd like to propose two small, but potentially very useful changes
to the I/O schema for tests:

* Supporting non-binary outputs beyond PASS/FAIL -
  integers/floats/booleans/strings/etc. - useful for performance tests.

* Supporting recording `compatible` values from the top of the device tree
  inside the test environment, for machines which use them - useful for
  correlating test results by hardware.

Here's the corresponding schema PR:

    https://github.com/kernelci/kcidb-io/pull/85

I'll follow up with a separate message for each of the changes, going over the
details and the rationale.

Feel free to reply to this or the other messages and leave comments in the PR!

Nick


* Re: KCIDB: Support non-binary test outputs
  2024-08-05  9:56 KCIDB: Test schema enhancements Nikolai Kondrashov
@ 2024-08-05 11:28 ` Nikolai Kondrashov
  2024-08-05 12:33   ` Mark Brown
                     ` (2 more replies)
  2024-08-05 12:28 ` KCIDB: Test schema enhancements Nikolai Kondrashov
  2024-08-05 12:34 ` Nikolai Kondrashov
  2 siblings, 3 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-05 11:28 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Mark Brown, Philip Li, Denys Fedoryshchenko,
	Michael Hofmann, Tales da Aparecida, Aditya Nagesh,
	Jeny Dhruvit Sheth, Sachin Sant, Hambardzumyan, Minas

Hi everyone,

On 8/5/24 12:56 PM, Nikolai Kondrashov wrote:
> Meanwhile I'd like to propose two small, but potentially very useful changes
> to the I/O schema for tests:
>
> * Supporting non-binary outputs beyond PASS/FAIL -
>   integers/floats/booleans/strings/etc. - useful for performance tests.
>
> * Supporting recording `compatible` values from the top of the device tree
>   inside the test environment, for machines which use them - useful for
>   correlating test results by hardware.
>
> Here's the corresponding schema PR:
>
>     https://github.com/kernelci/kcidb-io/pull/85
>
> I'll follow up with a separate message for each of the changes, going over the
> details and the rationale.

The support for non-binary test outputs introduces a new field to the "test"
objects called "value", which is itself an "object". It (and its abstract
meaning) is supposed to work together with the "status" field. That is, it
should only be considered when the test has actually executed, i.e. with a
"FAIL", "ERROR", "PASS", or "DONE" status. Normally "DONE" should be used
when the value is the test's output, and not an auxiliary value.
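
As a rough illustration of the rule above, a consumer could ignore "value"
unless the status shows that the test executed. This is only a sketch: the
`considered_value` helper and the `EXECUTED_STATUSES` name are made up for
this example and are not part of KCIDB.

```python
# Statuses meaning the test has actually executed (taken from the text above).
EXECUTED_STATUSES = {"FAIL", "ERROR", "PASS", "DONE"}


def considered_value(test):
    """Return the test's "value" object, or None if it should be ignored.

    Hypothetical helper, for illustration only.
    """
    if test.get("status") in EXECUTED_STATUSES:
        return test.get("value")
    return None
```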

Here is the (abbreviated) schema for the new field:

    "value": {
        "type": "object",
        "properties": {
            "integer": {"type": "integer"},
            "number": {"type": "number"},
            "string": {"type": "string"},
            "boolean": {"type": "boolean"},
        },
        "minProperties": 1,
        "additionalProperties": False
    },

The meaning of the value itself depends on the particular test, that is the
"path" field value. Each property inside the value corresponds to a data type.
At least one must be specified, but if more than one type property is set,
each is considered a different representation of the *same* value, and not a
different value. E.g. these can be specified at the same time: "integer": 1,
"number": 1, "boolean": true, "string": "true".

Specifying multiple types at once can be used to assist transitioning
test output to a different type, but in normal use only one of them
should be supplied.
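
The constraints described above can be sketched as a small hand-written check.
This is not how kcidb-io validates data (that is done with the JSON schema
itself); the `is_valid_value` function and the `VALUE_TYPES` mapping are
assumptions made for this example, and the check is stricter than JSON Schema
in one respect: a float such as 1.0 is not accepted under "integer".

```python
# Python types accepted for each property of "value" (assumed mapping).
VALUE_TYPES = {
    "integer": int,
    "number": (int, float),
    "string": str,
    "boolean": bool,
}


def is_valid_value(value):
    """Check a "value" object against the abbreviated schema above."""
    if not isinstance(value, dict) or len(value) < 1:
        return False            # must be an object, "minProperties": 1
    for name, item in value.items():
        if name not in VALUE_TYPES:
            return False        # "additionalProperties": False
        # bool is a subclass of int in Python, so only accept booleans
        # under "boolean", and nothing but booleans there.
        if isinstance(item, bool) != (name == "boolean"):
            return False
        if not isinstance(item, VALUE_TYPES[name]):
            return False
    return True
```

With this, {"integer": 1, "number": 1, "boolean": True, "string": "true"}
passes as several representations of the same value, while an empty object or
one with an unknown property is rejected.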

Here's an example test object using the "value" field:

{
    "id": "redhat:2876829c98e9878766a",
    "origin": "redhat",
    "build_id": "redhat:387d3459ef",
    "path": "redhat_ext4fs.performance.iops",
    "comment": "Red Hat Ext4 FS I/O performance, IOPS",
    "status": "DONE",
    "value": {"integer": 57324}
}

If the test in question has multiple values to report, then the submitting CI
system should create (synthetic) subtests under the test node, one for each
value. E.g. if the (imaginary) test above had random-read and random-write
IOPS to report separately, it could've been expressed as such:

"tests": [
    {
        "id": "redhat:2876829c98e9878766a",
        "origin": "redhat",
        "build_id": "redhat:387d3459ef",
        "path": "redhat_ext4fs.performance",
        "comment": "Red Hat Ext4 FS I/O performance",
        "status": "DONE"
    },
    {
        "id": "redhat:2876829c98e9878766a:rriops",
        "origin": "redhat",
        "build_id": "redhat:387d3459ef",
        "path": "redhat_ext4fs.performance.rriops",
        "comment": "Red Hat Ext4 FS I/O performance, random-read IOPS",
        "status": "DONE",
        "value": {"integer": 97524}
    },
    {
        "id": "redhat:2876829c98e9878766a:rwiops",
        "origin": "redhat",
        "build_id": "redhat:387d3459ef",
        "path": "redhat_ext4fs.performance.rwiops",
        "comment": "Red Hat Ext4 FS I/O performance, random-write IOPS",
        "status": "DONE",
        "value": {"integer": 46434}
    }
]

This separation allows us to control complexity, while at the same time
allowing KCIDB to analyze and track results in a more-or-less generic way.
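
On the submitting side, producing such synthetic subtests could look roughly
like the sketch below. The `split_into_subtests` function, and the convention
of appending the subtest name to the parent's "id" and "path", are assumptions
made for this example (they simply mirror the objects shown above); nothing
here is prescribed by KCIDB.

```python
def split_into_subtests(parent, values):
    """Build a parent test plus one synthetic subtest per named value.

    `parent` is a test object without "value"; `values` maps a subtest
    name to a (comment suffix, "value" object) pair.
    Hypothetical helper, for illustration only.
    """
    tests = [dict(parent)]
    for name, (suffix, value) in values.items():
        subtest = dict(parent)
        subtest["id"] = parent["id"] + ":" + name
        subtest["path"] = parent["path"] + "." + name
        subtest["comment"] = parent["comment"] + ", " + suffix
        subtest["value"] = value
        tests.append(subtest)
    return tests


parent = {
    "id": "redhat:2876829c98e9878766a",
    "origin": "redhat",
    "build_id": "redhat:387d3459ef",
    "path": "redhat_ext4fs.performance",
    "comment": "Red Hat Ext4 FS I/O performance",
    "status": "DONE",
}
tests = split_into_subtests(parent, {
    "rriops": ("random-read IOPS", {"integer": 97524}),
    "rwiops": ("random-write IOPS", {"integer": 46434}),
})
```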

Finally, if you need to monitor both the separate results and a combined
result, you can simply add the combined value to the parent test, e.g. by
adding the overall IOPS value back to the parent test above.

Don't hesitate to send your comments/questions/objections here or in the PR!

Nick


* Re: KCIDB: Test schema enhancements
  2024-08-05  9:56 KCIDB: Test schema enhancements Nikolai Kondrashov
  2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
@ 2024-08-05 12:28 ` Nikolai Kondrashov
  2024-08-06 17:39   ` [Automated-testing] " Nikolai Kondrashov
  2024-08-05 12:34 ` Nikolai Kondrashov
  2 siblings, 1 reply; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-05 12:28 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, kernelci@lists.linux.dev, Don Zickus,
	Mark Brown, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/5/24 12:56 PM, Nikolai Kondrashov wrote:
> Meanwhile I'd like to propose two small, but potentially very useful changes
> to the I/O schema for tests:
>
> * Supporting non-binary outputs beyond PASS/FAIL -
>   integers/floats/booleans/strings/etc. - useful for performance tests.
>
> * Supporting recording `compatible` values from the top of the device tree
>   inside the test environment, for machines which use them - useful for
>   correlating test results by hardware.
>
> Here's the corresponding schema PR:
>
>     https://github.com/kernelci/kcidb-io/pull/85
>
> I'll follow up with a separate message for each of the changes, going over the
> details and the rationale.

The "compatible" property in the root of a device tree specifies the device
vendor, the device (board) model, as well as often the device SoC and family.
These are generally encoded as several strings, ordered from most to least
specific, each potentially containing multiple parts separated by commas.
E.g.:

    "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3"

See more here, for example:

https://docs.kernel.org/devicetree/usage-model.html#platform-identification

Here's the (abbreviated) schema for the new "compatible" field, added to the
"test"'s "environment" object:

    "type": "array",
    "items": {
        "type": "string",
        "pattern": "^[^ ]+(,[^ ]+)*$"
    }

The particular regular expression is based on all I was able to find about
restrictions on the "compatible" values.
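
For anyone wanting to check their data before submitting, the pattern can be
tried out with Python's `re` module, as in the sketch below. JSON Schema
patterns follow the ECMA 262 dialect, but this particular expression should
behave the same in Python. The `is_valid_compatible` helper is hypothetical.

```python
import re

# The pattern from the abbreviated schema above.
COMPATIBLE_RE = re.compile(r"^[^ ]+(,[^ ]+)*$")


def is_valid_compatible(compatible):
    """Check that every item of a "compatible" list matches the pattern."""
    return isinstance(compatible, list) and all(
        isinstance(item, str) and COMPATIBLE_RE.match(item) is not None
        for item in compatible
    )
```

Note that since `[^ ]` also matches a comma, the pattern in effect accepts any
non-empty string without spaces, so "ti,omap3" passes and "ti, omap3" doesn't.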

All of the above, of course, applies only to systems using the device tree.
We will have to come up with something else for other systems (ACPI ID?), but
we can already benefit from "compatible", as it's relatively well-defined,
well-recognized, flexible, and the community is already looking after its
consistency and usefulness, so we won't have to come up with anything new
ourselves.

The database implementation for this would also let us correlate results on
more general levels than exact boards, e.g. SoCs, families, and vendors.

For the latter there's already an effort to document actual vendor names,
which would be useful in making dashboards nicer:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/vendor-prefixes.yaml
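
To give an idea of the vendor-level correlation, here is a minimal sketch of
extracting vendor prefixes from a "compatible" list. It assumes the common
"vendor,model" form and is not the actual database implementation; the
`vendor_prefixes` function is made up for this example.

```python
def vendor_prefixes(compatible):
    """Return the distinct vendor prefixes of "compatible" strings, in order.

    Assumes the common "vendor,model" form; strings without a comma are
    skipped. Hypothetical helper, for illustration only.
    """
    vendors = []
    for item in compatible:
        vendor, comma, _model = item.partition(",")
        if comma and vendor not in vendors:
            vendors.append(vendor)
    return vendors
```

For the BeagleBoard example above this yields just ["ti"], which could then be
looked up in vendor-prefixes.yaml for a human-readable name.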

Don't hesitate to comment here, or in the PR!

Nick


* Re: KCIDB: Support non-binary test outputs
  2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
@ 2024-08-05 12:33   ` Mark Brown
  2024-08-05 14:26     ` Nikolai Kondrashov
  2024-08-05 17:11     ` Bird, Tim
  2024-08-05 17:02   ` Bird, Tim
  2024-08-16 10:03   ` [Automated-testing] KCIDB: Support non-binary test outputs V2 Nikolai Kondrashov
  2 siblings, 2 replies; 26+ messages in thread
From: Mark Brown @ 2024-08-05 12:33 UTC (permalink / raw)
  To: Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas


On Mon, Aug 05, 2024 at 02:28:24PM +0300, Nikolai Kondrashov wrote:

> Here is the (abbreviated) schema for the new field:

>     "value": {
>         "type": "object",
>         "properties": {
>             "integer": {"type": "integer"},
>             "number": {"type": "number"},
>             "string": {"type": "string"},
>             "boolean": {"type": "boolean"},
>         },
>         "minProperties": 1,
>         "additionalProperties": False
>     },

> The meaning of the value itself depends on the particular test, that is the
> "path" field value. Each property inside the value corresponds to a data type.

Might it be useful to directly specify units for use with "number" to
help with normalising data between different CI systems or hardware -
for example with boot times both seconds and milliseconds seem like
reasonable units to use?  It might also be useful for UIs, though we
could also do that with a separate table for the tests that they can
query.  Perhaps I'm just worrying too much about specialist cases where
it's likely that CI systems won't just be picking up an off the shelf
suite that has standard units.



* Re: KCIDB: Test schema enhancements
  2024-08-05  9:56 KCIDB: Test schema enhancements Nikolai Kondrashov
  2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
  2024-08-05 12:28 ` KCIDB: Test schema enhancements Nikolai Kondrashov
@ 2024-08-05 12:34 ` Nikolai Kondrashov
  2 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-05 12:34 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Mark Brown, Philip Li, Denys Fedoryshchenko,
	Michael Hofmann, Tales da Aparecida, Aditya Nagesh,
	Jeny Dhruvit Sheth, Sachin Sant, Hambardzumyan, Minas

On 8/5/24 12:56 PM, Nikolai Kondrashov wrote:
> Hello everyone (potentially) involved with sending data to KCIDB,
> 
> We've been working hard at KernelCI to bring you a bunch of exciting changes,
> to be presented at LPC in September, and in our news channels.
> 
> Meanwhile I'd like to propose two small, but potentially very useful changes
> to the I/O schema for tests:
> 
> * Supporting non-binary outputs beyond PASS/FAIL -
>   integers/floats/booleans/strings/etc. - useful for performance tests.
> 
> * Supporting recording `compatible` values from the top of the device tree
>   inside the test environment, for machines which use them - useful for
>   correlating test results by hardware.
> 
> Here's the corresponding schema PR:
> 
>     https://github.com/kernelci/kcidb-io/pull/85
> 
> I'll follow up with a separate message for each of the changes, going over the
> details and the rationale.
> 
> Feel free to reply to this or the other messages and leave comments in the PR!

If there are no objections by that time, I would like to merge this next
Monday, Aug 12, and start working on the rest of the support for these
enhancements.

Don't hesitate to object to *that*, if you feel it's too fast!

Nick


* Re: KCIDB: Support non-binary test outputs
  2024-08-05 12:33   ` Mark Brown
@ 2024-08-05 14:26     ` Nikolai Kondrashov
  2024-08-05 16:32       ` Mark Brown
  2024-08-05 17:11     ` Bird, Tim
  1 sibling, 1 reply; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-05 14:26 UTC (permalink / raw)
  To: Mark Brown
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/5/24 3:33 PM, Mark Brown wrote:
> On Mon, Aug 05, 2024 at 02:28:24PM +0300, Nikolai Kondrashov wrote:
> 
>> Here is the (abbreviated) schema for the new field:
> 
>>     "value": {
>>         "type": "object",
>>         "properties": {
>>             "integer": {"type": "integer"},
>>             "number": {"type": "number"},
>>             "string": {"type": "string"},
>>             "boolean": {"type": "boolean"},
>>         },
>>         "minProperties": 1,
>>         "additionalProperties": False
>>     },
> 
>> The meaning of the value itself depends on the particular test, that is the
>> "path" field value. Each property inside the value corresponds to a data type.
> 
> Might it be useful to directly specify units for use with "number" to
> help with normalising data between different CI systems or hardware -
> for example with boot times both seconds and milliseconds seem like
> reasonable units to use?  It might also be useful for UIs, though we
> could also do that with a separate table for the tests that they can
> query.  Perhaps I'm just worrying too much about specialist cases where
> it's likely that CI systems won't just be picking up an off the shelf
> suite that has standard units.

This is a totally valid concern. We could add a "units" field, e.g. beside
"value". However, I'm not sure how we could use it. Sure, we can put them next
to the value in the dashboard, which would look nice, but then we can also put
them into the "comment", as my examples do, and I'm not sure if we would be
able to do much with them in the database.

Even if we have the exponent separate, I'm not sure we can make use of it
(converting units on the fly won't work well with indices). I mean, we can
(and should) compare only values with the same unit, but which one, if we have
multiple? Both separately?

On the one hand I like having the explicit unit, on the other hand we can get
a similar result with simply using different test names for different units,
and have them specified in the comments, for humans 🤔

Nick


* Re: KCIDB: Support non-binary test outputs
  2024-08-05 14:26     ` Nikolai Kondrashov
@ 2024-08-05 16:32       ` Mark Brown
  2024-08-06 17:03         ` Nikolai Kondrashov
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Brown @ 2024-08-05 16:32 UTC (permalink / raw)
  To: Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas


On Mon, Aug 05, 2024 at 05:26:52PM +0300, Nikolai Kondrashov wrote:
> On 8/5/24 3:33 PM, Mark Brown wrote:
> > On Mon, Aug 05, 2024 at 02:28:24PM +0300, Nikolai Kondrashov wrote:

> > Might it be useful to directly specify units for use with "number" to
> > help with normalising data between different CI systems or hardware -

> This is a totally valid concern. We could add a "units" field, e.g. beside
> "value". However, I'm not sure how we could use it. Sure, we can put them next
> to the value in the dashboard, which would look nice, but then we can also put
> them into the "comment", as my examples do, and I'm not sure if we would be
> able to do much with them in the database.

> Even if we have the exponent separate, I'm not sure we can make use of it
> (converting units on the fly won't work well with indices). I mean, we can
> (and should) compare only values with the same unit, but which one, if we have
> multiple? Both separately?

> On the one hand I like having the explicit unit, on the other hand we can get
> a similar result with simply using different test names for different units,
> and have them specified in the comments, for humans 🤔

Just thinking out loud here but perhaps what we want is something in the
ingestion path which either validates a schema that says "Test X must
have unit Y" or normalises the units on the way in.  I think my concern
is more on the write side than on the read side (modulo the display
stuff), or rather making sure that what's available to the read side can
be joined up.



* RE: KCIDB: Support non-binary test outputs
  2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
  2024-08-05 12:33   ` Mark Brown
@ 2024-08-05 17:02   ` Bird, Tim
  2024-08-06 17:32     ` [Automated-testing] " Nikolai Kondrashov
  2024-08-16 10:03   ` [Automated-testing] KCIDB: Support non-binary test outputs V2 Nikolai Kondrashov
  2 siblings, 1 reply; 26+ messages in thread
From: Bird, Tim @ 2024-08-05 17:02 UTC (permalink / raw)
  To: Nikolai Kondrashov, syzkaller, Dmitry Vyukov, Vishal Bhoj,
	Alice Ferrazzi, automated-testing@lists.yoctoproject.org,
	Cristian Marussi, Johnson George, Veronika Kabatova,
	Guillaume Tucker, kernelci@lists.linux.dev, Don Zickus,
	Mark Brown, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas



> -----Original Message-----
> From: Nikolai Kondrashov <spbnick@gmail.com>
> Hi everyone,
> 
> On 8/5/24 12:56 PM, Nikolai Kondrashov wrote:
> > Meanwhile I'd like to propose two small, but potentially very useful changes
> > to the I/O schema for tests:
> >
> > * Supporting non-binary outputs beyond PASS/FAIL -
> >   integers/floats/booleans/strings/etc. - useful for performance tests.
> >
> > * Supporting recording `compatible` values from the top of the device tree
> >   inside the test environment, for machines which use them - useful for
> >   correlating test results by hardware.
> >
> > Here's the corresponding schema PR:
> >
> >     https://github.com/kernelci/kcidb-io/pull/85
> >
> > I'll follow up with a separate message for each of the changes, going over the
> > details and the rationale.
> 
> The support for non-binary test outputs introduces a new field to the "test"
> objects called "value", being an "object" itself. It (and its abstract
> meaning) are supposed to work together with the "status" field. That is, it
> should only be considered when the test has actually executed. I.e. with a
> "FAIL", "ERROR", "PASS", or "DONE" status only. Normally "DONE" should be
> used, when the value is the test's output, and not an auxiliary value.

I'm not sure I'm reading this correctly, but as I understand it, this pre-supposes
that the value and the testcase are synonymous.  Is that right?  Are there
values that are not assigned to a testcase?

In my experience, many tests (particularly IO performance tests) produce
a whole lot of values (basically a matrix for combinations of different IO sizes,
IO direction (read/write), patterns (sequential vs random), and scheduling
classes).  It's quite common to have a tester select only a few values to
convert into testcases (that is, items that would cause a test to pass or fail).

> 
> Here is the (abbreviated) schema for the new field:
> 
>     "value": {
>         "type": "object",
>         "properties": {
>             "integer": {"type": "integer"},
>             "number": {"type": "number"},
>             "string": {"type": "string"},
>             "boolean": {"type": "boolean"},
>         },
>         "minProperties": 1,
>         "additionalProperties": False
>     },
> 
> The meaning of the value itself depends on the particular test, that is the
> "path" field value. Each property inside the value corresponds to a data type.
> At least one must be specified, but if more than one type property is set,
> each is considered a different representation of the *same* value, and not a
> different value. E.g. these can be specified at the same time: "integer": 1,
> "number": 1, "boolean": true, "string": "true".

I'm not sure what the intended use is for these different type properties.
Almost universally, benchmark data is expressed as numeric values (that is,
numbers).  Are these other types used to hold intermediate formats?
If so, for what reason?  I would suggest dropping the type field, and just
making them all numbers.
 
> Specifying multiple types at once can be used to assist transitioning
> test output to a different type, but in normal use only one of them
> should be supplied.
> 
> Here's an example test object using the "value" field:
> 
> {
>     "id": "redhat:2876829c98e9878766a",
>     "origin": "redhat",
>     "build_id": "redhat:387d3459ef",
>     "path": "redhat_ext4fs.performance.iops",
>     "comment": "Red Hat Ext4 FS I/O performance, IOPS",
>     "status": "DONE",
>     "value": {"integer": 57324}
> }
> 
> If the test in question has multiple values to report, then the submitting CI
> system should create (synthetic) subtests under the test node, one for each
> value. E.g. if the (imaginary) test above had random-read and random-write
> IOPS to report separately, it could've been expressed as such:
> 
> "tests": [
>     {
>         "id": "redhat:2876829c98e9878766a",
>         "origin": "redhat",
>         "build_id": "redhat:387d3459ef",
>         "path": "redhat_ext4fs.performance",
>         "comment": "Red Hat Ext4 FS I/O performance",
>         "status": "DONE"
>     },
>     {
>         "id": "redhat:2876829c98e9878766a:rriops",
>         "origin": "redhat",
>         "build_id": "redhat:387d3459ef",
>         "path": "redhat_ext4fs.performance.rriops",
>         "comment": "Red Hat Ext4 FS I/O performance, random-read IOPS",
>         "status": "DONE",
>         "value": {"integer": 97524}
>     },
>     {
>         "id": "redhat:2876829c98e9878766a:rwiops",
>         "origin": "redhat",
>         "build_id": "redhat:387d3459ef",
>         "path": "redhat_ext4fs.performance.rwiops",
>         "comment": "Red Hat Ext4 FS I/O performance, random-write IOPS",
>         "status": "DONE",
>         "value": {"integer": 46434}
>     }
> ]
> 
> This separation allows us to control complexity, while at the same time
> allowing KCIDB to analyze and track results in a more-or-less generic way.
> 
> Finally, if you need to monitor both the separate results and a combined
> result, you simply add it to the parent test. Like adding back the overall
> IOPS value to the parent test above.
> 
> Don't hesitate to send your comments/questions/objections here or in the PR!
> 
> Nick


* RE: KCIDB: Support non-binary test outputs
  2024-08-05 12:33   ` Mark Brown
  2024-08-05 14:26     ` Nikolai Kondrashov
@ 2024-08-05 17:11     ` Bird, Tim
  2024-08-06 17:18       ` [Automated-testing] " Nikolai Kondrashov
  1 sibling, 1 reply; 26+ messages in thread
From: Bird, Tim @ 2024-08-05 17:11 UTC (permalink / raw)
  To: Mark Brown, Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing@lists.yoctoproject.org, Cristian Marussi,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas



> -----Original Message-----
> From: Mark Brown <broonie@kernel.org>
> Sent: Monday, August 5, 2024 6:33 AM
> To: Nikolai Kondrashov <spbnick@gmail.com>
> Cc: syzkaller <syzkaller@googlegroups.com>; Dmitry Vyukov <dvyukov@google.com>; Vishal Bhoj <vishal.bhoj@linaro.org>; Alice Ferrazzi
> <alicef@gentoo.org>; automated-testing@lists.yoctoproject.org; Cristian Marussi <cristian.marussi@arm.com>; Bird, Tim
> <Tim.Bird@sony.com>; Johnson George <Johnson.George@microsoft.com>; Veronika Kabatova <vkabatov@redhat.com>; Guillaume Tucker
> <guillaume.tucker@collabora.com>; kernelci@lists.linux.dev; Don Zickus <dzickus@redhat.com>; Philip Li <philip.li@intel.com>; Denys
> Fedoryshchenko <denys.f@collabora.com>; Michael Hofmann <mhofmann@redhat.com>; Tales da Aparecida <tdaapare@redhat.com>;
> Aditya Nagesh <adityanagesh@microsoft.com>; Jeny Dhruvit Sheth <jeny.sadadia@collabora.com>; Sachin Sant <sachinp@linux.ibm.com>;
> Hambardzumyan, Minas <minas@ti.com>
> Subject: Re: KCIDB: Support non-binary test outputs
> 
> On Mon, Aug 05, 2024 at 02:28:24PM +0300, Nikolai Kondrashov wrote:
> 
> > Here is the (abbreviated) schema for the new field:
> 
> >     "value": {
> >         "type": "object",
> >         "properties": {
> >             "integer": {"type": "integer"},
> >             "number": {"type": "number"},
> >             "string": {"type": "string"},
> >             "boolean": {"type": "boolean"},
> >         },
> >         "minProperties": 1,
> >         "additionalProperties": False
> >     },
> 
> > The meaning of the value itself depends on the particular test, that is the
> > "path" field value. Each property inside the value corresponds to a data type.
> 
> Might it be useful to directly specify units for use with "number" to
> help with normalising data between different CI systems or hardware -
> for example with boot times both seconds and milliseconds seem like
> reasonable units to use?  It might also be useful for UIs, though we
> could also do that with a separate table for the tests that they can
> query.  Perhaps I'm just worrying too much about specialist cases where
> it's likely that CI systems won't just be picking up an off the shelf
> suite that has standard units.

I'll second this.  A number of benchmarks output their values in different
units, depending on the performance of the system, and it's valuable
to be able to detect that different units are being used, in order to 
compare results effectively.  For example, some IO tests will report KB/s
on a slow machine and MB/s on a fast machine, for the same measurement.

Tests sometimes need a policy for which unit is the canonical one for the
test (and to express reference values in that canonical unit format).  This
might require a test results parser to do unit conversion before comparison
with reference values, in order to determine testcase results.
 -- Tim
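
A minimal sketch of the kind of unit conversion described above, assuming
bytes per second as the canonical unit and decimal (1000-based) prefixes; some
tools use 1024 instead, so the factors would have to be checked per test. The
`to_canonical` function and the table are hypothetical, not taken from any
existing parser.

```python
# Factors for converting to the assumed canonical unit, bytes per second.
# Decimal prefixes are an assumption; some tools use powers of 1024.
TO_BYTES_PER_SECOND = {
    "B/s": 1,
    "KB/s": 1000,
    "MB/s": 1000**2,
    "GB/s": 1000**3,
}


def to_canonical(number, unit):
    """Convert a throughput reading to bytes per second.

    Raises KeyError for units missing from the table, so that unknown
    units are noticed rather than silently compared.
    """
    return number * TO_BYTES_PER_SECOND[unit]
```

After conversion, 1500 KB/s from a slow machine and 1.5 MB/s from a fast one
compare as the same measurement.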




* Re: KCIDB: Support non-binary test outputs
  2024-08-05 16:32       ` Mark Brown
@ 2024-08-06 17:03         ` Nikolai Kondrashov
  2024-08-06 19:20           ` Mark Brown
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-06 17:03 UTC (permalink / raw)
  To: Mark Brown, Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/5/24 7:32 PM, Mark Brown wrote:
> On Mon, Aug 05, 2024 at 05:26:52PM +0300, Nikolai Kondrashov wrote:
>> On 8/5/24 3:33 PM, Mark Brown wrote:
>>> On Mon, Aug 05, 2024 at 02:28:24PM +0300, Nikolai Kondrashov wrote:
> 
>>> Might it be useful to directly specify units for use with "number" to
>>> help with normalising data between different CI systems or hardware -
> 
>> This is a totally valid concern. We could add a "units" field, e.g. beside
>> "value". However, I'm not sure how we could use it. Sure, we can put them next
>> to the value in the dashboard, which would look nice, but then we can also put
>> them into the "comment", as my examples do, and I'm not sure if we would be
>> able to do much with them in the database.
> 
>> Even if we have the exponent separate, I'm not sure we can make use of it
>> (converting units on the fly won't work well with indices). I mean, we can
>> (and should) compare only values with the same unit, but which one, if we have
>> multiple? Both separately?
> 
>> On the one hand I like having the explicit unit, on the other hand we can get
>> a similar result with simply using different test names for different units,
>> and have them specified in the comments, for humans 🤔
> 
> Just thinking out loud here but perhaps what we want is something in the
> ingestion path which either validates a schema that says "Test X must
> have unit Y" or normalises the units on the way in.  I think my concern
> is more on the write side than on the read side (modulo the display
> stuff), or rather making sure that what's available to the read side can
> be joined up.

So far, one of the major principles of KCIDB was "what you send is what you
get back", that is, it doesn't ever modify the data submitted by CI systems
(well, except perhaps some corner cases like floating point or timestamp
precision differences). I'm not saying we should *always* keep to this
principle in the future, but it *does* simplify a lot of reasoning about how
data is processed and queried. Also, the need to preprocess submitted data is
a good sign that your schema needs improving.

In that light, I think validation is the right way here. And validating
submitter-provided units could be the right way. This is actually similar to
the situation with test "paths". Currently we're accepting everything, but our
target is to tighten that down to help correlation. Perhaps by sending
(aggregated) warnings about unknown test paths to submitters.

This looser approach allows us to admit new data to the database faster, as it
doesn't need to undergo cataloguing first. We certainly don't want to spend
time arbitrating *every* test name or a measurement unit at this point, and we
don't want to slow down adoption from CI systems, and introduction of new
tests. But I think letting them know about deviations could work. After all,
it's in the CI system interest to comply, as that improves result quality, and
raises the chance of reaching maintainers, which they're here for.

And then we need to consider how many different test paths there could be. In
the past six months the database has seen 88122 unique test paths. Yep,
eighty-eight thousand. Of course, tests reporting a value would be a tiny fraction
of those, but even if it's just 1%, that's still almost nine hundred paths,
seen in six months. Most likely we can do quite well with regular expressions,
as many of those are only a bit different.

So, we can have such a catalog, and for example only validate the matching
entries, instead of requiring an entry to be there before accepting the data.
That would keep it easy to send new data, but with the catalog you would have
tighter control for stuff you care about.
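
To illustrate, here is a minimal sketch in Python of such "validate only what
matches" checking. Everything in it is an assumption made for illustration:
the catalog layout, the path patterns, the unit strings, and the check_unit()
name are hypothetical and not part of KCIDB or the schema PR.

```python
import re

# Hypothetical catalog: regular expression for test paths -> expected unit.
# Neither the patterns nor the units come from KCIDB; they are placeholders.
CATALOG = [
    (re.compile(r"^boot(\..*)?\.time$"), "s"),
    (re.compile(r"^fio\..*\.bandwidth$"), "B/s"),
]


def check_unit(path, unit):
    """Return a warning string for a catalogued path with an unexpected unit.

    Paths matching no catalog entry are accepted silently (None is returned),
    so new tests can still be submitted without being catalogued first.
    """
    for pattern, expected in CATALOG:
        if pattern.match(path):
            if unit != expected:
                return f"{path}: expected unit {expected!r}, got {unit!r}"
            return None
    return None
```

The warnings returned this way could be aggregated and sent to submitters,
instead of rejecting the data.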

However, I would leave it to the submitter to observe the correct exponent
(e.g. KB vs. MB vs. GB). We can perhaps standardize on units *without* the
metric prefix, and rely on either floating-point exponent, or larger integer
representations (like 8-byte bigints in PostgreSQL, JSON integers have no size
themselves) to handle the required ranges. This way the dashboard would be
able to display the units, *and* apply the prefix as necessary, automatically.
Or perhaps specify the prefix (or exponent) separately from the unit, so the
usual range would fit, but the dashboard could still scale the numbers on
display. OTOH, indices would be no use for separate value/exponent
representation.

So, we could start with adding a string field for the unit, and e.g. an
integer field for the exponent (10^E), both stored in the database, and
combined for display. And we could add the validation later.

Or we can bake in the exponent into the unit, and make use of indexes, but
require people to pay more attention to them.

Oooorr, we could implement as is for the start, and add units later, when we
see the situation better, based on what's actually coming in. This would make
it easier to implement the dashboard support and regression tracking, as we
would only need to correlate across test paths (and other parameters, of
course), without the need to add the units to all the queries and
considerations. The submitters would be able to clarify the units in the test
paths and the comments.

And the final question we need to consider is how many people would actually
bother finding out what the units exactly are in their tests, and specifying
them?

Sorry for the wall of text :D
Thanks for reading!
Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs
  2024-08-05 17:11     ` Bird, Tim
@ 2024-08-06 17:18       ` Nikolai Kondrashov
  0 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-06 17:18 UTC (permalink / raw)
  To: Tim Bird, Mark Brown, Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing@lists.yoctoproject.org, Cristian Marussi,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

(Sorry, re-sending, as I lost all recipients accidentally on the last try)

Hi Tim,

Thank you for your responses!

On 8/5/24 8:11 PM, Tim Bird wrote:
>> -----Original Message-----
>> From: Mark Brown <broonie@kernel.org>
>> Sent: Monday, August 5, 2024 6:33 AM
>> To: Nikolai Kondrashov <spbnick@gmail.com>
>> On Mon, Aug 05, 2024 at 02:28:24PM +0300, Nikolai Kondrashov wrote:
>>
>>> The meaning of the value itself depends on the particular test, that is the
>>> "path" field value. Each property inside the value corresponds to a data type.
>>
>> Might it be useful to directly specify units for use with "number" to
>> help with normalising data between different CI systems or hardware -
>> for example with boot times both seconds and milliseconds seem like
>> reasonable units to use?  It might also be useful for UIs, though we
>> could also do that with a separate table for the tests that they can
>> query.  Perhaps I'm just worrying too much about specialist cases where
>> it's likely that CI systems won't just be picking up an off the shelf
>> suite that has standard units.
>
> I'll second this.  A number of benchmarks output their values in different
> units, depending on the performance of the system, and it's valuable
> to be able to detect that different units are being used, in order to
> compare results effectively.  For example, some IO tests will report KB/s
> on a slow machine and MB/s on a fast machine, for the same measurement.
>
> Tests sometimes need a policy for which unit to be the canonical one for the
> test (and to express reference values in that canonical unit format).  This
> might require a test results parser to do units conversion, before comparison
> with reference values in order to detect testcase results.

JSON doesn't really have an integer or floating-point size, so I would prefer
that the submitting CI system stuck to a single exponent / metric prefix in
their reports, and scaled the values itself, to help us avoid (complicated)
pre-processing, and help the database indexes improve query performance. And,
after all, they know their tests best to pick the right unit/scale.

If we agree on prefix-less units, for example, then the dashboards would be
able to scale the data and apply their own prefix, as needed, automatically.
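
As a rough sketch of what that could look like on the dashboard side, assuming
values arrive in a prefix-less unit (the humanize() name and the prefix table
are made up for this example, and only a few decimal prefixes are covered):

```python
# Decimal (SI) prefixes, largest first. Only a few are listed for brevity.
SI_PREFIXES = [(10**9, "G"), (10**6, "M"), (10**3, "k")]


def humanize(value, unit):
    """Scale a value reported in a prefix-less unit and add a prefix."""
    for factor, prefix in SI_PREFIXES:
        if abs(value) >= factor:
            return f"{value / factor:g} {prefix}{unit}"
    # Small values are shown as they are, without a prefix
    return f"{value:g} {unit}"
```

With this, a submitter always reporting plain "B/s" would get 1500000 shown
as "1.5 MB/s", while the stored values stay directly comparable.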

Please also see my response to Mark, for more considerations.

Thanks!
Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs
  2024-08-05 17:02   ` Bird, Tim
@ 2024-08-06 17:32     ` Nikolai Kondrashov
  0 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-06 17:32 UTC (permalink / raw)
  To: Tim Bird, Nikolai Kondrashov, syzkaller, Dmitry Vyukov,
	Vishal Bhoj, Alice Ferrazzi,
	automated-testing@lists.yoctoproject.org, Cristian Marussi,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Mark Brown, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

On 8/5/24 8:02 PM, Tim Bird wrote:
>> -----Original Message-----
>> From: Nikolai Kondrashov <spbnick@gmail.com>
>> On 8/5/24 12:56 PM, Nikolai Kondrashov wrote:
>> The support for non-binary test outputs introduces a new field to the "test"
>> objects called "value", being an "object" itself. It (and its abstract
>> meaning) are supposed to work together with the "status" field. That is, it
>> should only be considered when the test has actually executed. I.e. with a
>> "FAIL", "ERROR", "PASS", or "DONE" status only. Normally "DONE" should be
>> used, when the value is the test's output, and not an auxiliary value.
>
> I'm not sure I'm reading this correctly, but as I understand it, this
> pre-supposes that the value and the testcase are synonymous.  Is that right?

Yep. But don't take the meaning of "testcase" too literally. You can report as
many "synthetic" cases as you need, to report your data. They don't actually
have to correspond to actual "testcases". View them as a tool to express your
results. Just make sure you get the agreement of anyone else reporting the
same data on how you're going to do it.

> Are there values that are not assigned to a testcase?

I don't know? I haven't seen the gamut of the data we need. However, we can
always make up a "testcase" and assign them there, if needed, as I say above.

> In my experience, many tests (particularly IO performance tests) produce
> a whole lot of values (basically a matrix for combinations of different IO
> sizes, IO direction (read/write), patterns (sequential vs random), and
> scheduling classes).  It's quite common to have a tester select only a few
> values to convert into testcases (that is, items that would cause a test to
> pass or fail).

Sure. To decide which values you want reported explicitly, consider which ones
you want KCIDB to be able to track, graph, and report deviations on
(eventually). You can put the complete data into our free-form "misc"
field, if you'd still like to be able to get to it (but not query or analyze
it within KCIDB).

>> The meaning of the value itself depends on the particular test, that is the
>> "path" field value. Each property inside the value corresponds to a data type.
>> At least one must be specified, but if more than one type property is set,
>> each is considered a different representation of the *same* value, and not a
>> different value. E.g. these can be specified at the same time: "integer": 1,
>> "number": 1, "boolean": true, "string": "true".
>
> I'm not sure what the intended use is for these different type properties.
> Almost universally, benchmark data is expressed as numeric values (that is,
> numbers).  Are these other types used to hold intermediate formats?
> If so, for what reason?  I would suggest dropping the type field, and just
> making them all numbers.

We don't really have a type field, we just have separate fields for different
types. I don't really know what types we would want. I just showed some types
there, mostly as an illustration. But I think we might need the floating-point
and the integers separately, so different precision requirements could be
satisfied. That alone is enough to require us to support different types.

Regarding the strings, I suppose some tests (perhaps not performance ones)
could produce some discrete categorical outputs, and then we would be able to
graph and track them. But again, I haven't actually seen any.

The boolean was added just to completely cover the atomic types. Although I
could probably think of a use case, we can easily drop it, as well as the
strings, and add them back when necessary, no problem.

Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Test schema enhancements
  2024-08-05 12:28 ` KCIDB: Test schema enhancements Nikolai Kondrashov
@ 2024-08-06 17:39   ` Nikolai Kondrashov
  2024-08-06 19:26     ` Mark Brown
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-06 17:39 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, kernelci@lists.linux.dev, Don Zickus,
	Mark Brown, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/5/24 3:28 PM, Nikolai Kondrashov wrote:
> On 8/5/24 12:56 PM, Nikolai Kondrashov wrote:
>> Meanwhile I'd like to propose two small, but potentially very useful changes
>> to the I/O schema for tests:
>>
>> * Supporting non-binary outputs beyond PASS/FAIL -
>>   integers/floats/booleans/strings/etc. - useful for performance tests.
>>
>> * Supporting recording `compatible` values from the top of the device tree
>>   inside the test environment, for machines which use them - useful for
>>   correlating test results by hardware.
>>
>> Here's the corresponding schema PR:
>>
>>     https://github.com/kernelci/kcidb-io/pull/85
>>
>> I'll follow up with a separate message for each of the changes, going over the
>> details and the rationale.
> 
> The "compatible" property in the root of a device tree specifies the device
> vendor, the device (board) model, as well as often the device SoC and family.
> These are generally encoded as several strings, ordered from most to least
> specific, each potentially containing multiple parts separated by commas.
> E.g.:
> 
>     "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3"

Just wanted to highlight, that we also need feedback on this. Thank you!

Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: KCIDB: Support non-binary test outputs
  2024-08-06 17:03         ` Nikolai Kondrashov
@ 2024-08-06 19:20           ` Mark Brown
  2024-08-08 18:05             ` Nikolai Kondrashov
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Brown @ 2024-08-06 19:20 UTC (permalink / raw)
  To: Nikolai Kondrashov
  Cc: Nikolai Kondrashov, syzkaller, Dmitry Vyukov, Vishal Bhoj,
	Alice Ferrazzi, automated-testing, Cristian Marussi, Tim Bird,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

On Tue, Aug 06, 2024 at 08:03:00PM +0300, Nikolai Kondrashov wrote:

> In that light, I think validation is the right way here. And validating
> submitter-provided units could be the right way. This is actually similar to
> the situation with test "paths". Currently we're accepting everything, but our
> target is to tighten that down to help correlation. Perhaps by sending
> (aggregated) warnings about unknown test paths to submitters.

TBH as a submitter getting specific stuff back immediately (or at least
the option for it) is really helpful.

> This looser approach allows us to admit new data to the database faster, as it
> doesn't need to undergo cataloguing first. We certainly don't want to spend
> time arbitrating *every* test name or a measurement unit at this point, and we
> don't want to slow down adoption from CI systems, and introduction of new
> tests. But I think letting them know about deviations could work. After all,
> it's in the CI system interest to comply, as that improves result quality, and
> raises the chance of reaching maintainers, which they're here for.

Perhaps per test schemas of some kind (not sure how exactly you'd go
about doing it) could help here, if the test is unknown then just let it
in but if it's a test we know about and we've defined the units for then
enforce those units?  That way there's the looser stuff and reporting
that shows what we could work on standardising, and things that have
been standardised are hopefully going to be more joined up?

> However, I would leave it to the submitter to observe the correct exponent
> (e.g. KB vs. MB vs. GB). We can perhaps standardize on units *without* the
> metric prefix, and rely on either floating-point exponent, or larger integer
> representations (like 8-byte bigints in PostgreSQL, JSON integers have no size
> themselves) to handle the required ranges. This way the dashboard would be
> able to display the units, *and* apply the prefix as necessary, automatically.
> Or perhaps specify the prefix (or exponent) separately from the unit, so the
> usual range would fit, but the dashboard could still scale the numbers on
> display. OTOH, indices would be no use for separate value/exponent
> representation.

Putting the exponent in as a number does seem like it'd be much more
helpful for machine processing.

> And the final question we need to consider is how many people would actually
> bother finding out what the units exactly are in their tests, and specifying
> them?

I suspect there's going to be a fair amount of stuff where there's a
fairly clear specific unit for one reason or other that's commonly used
when talking about the test (eg, things like I/O benchmarks, run times,
or temperatures) so it'll be immediately obvious and also a bunch of
things where the number is just a number for the benchmark and nobody
cares about whatever the units actually are anyway.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Test schema enhancements
  2024-08-06 17:39   ` [Automated-testing] " Nikolai Kondrashov
@ 2024-08-06 19:26     ` Mark Brown
  2024-08-07 10:46       ` Nikolai Kondrashov
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Brown @ 2024-08-06 19:26 UTC (permalink / raw)
  To: Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, kernelci@lists.linux.dev, Don Zickus,
	Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On Tue, Aug 06, 2024 at 08:39:13PM +0300, Nikolai Kondrashov wrote:
> On 8/5/24 3:28 PM, Nikolai Kondrashov wrote:

> > The "compatible" property in the root of a device tree specifies the device
> > vendor, the device (board) model, as well as often the device SoC and family.
> > These are generally encoded as several strings, ordered from most to least
> > specific, each potentially containing multiple parts separated by commas.
> > E.g.:

> >     "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3"

> Just wanted to highlight, that we also need feedback on this. Thank you!

It looks good to me - I suspect it's just uncontroversial and everyone
thinks it makes sense?  

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Test schema enhancements
  2024-08-06 19:26     ` Mark Brown
@ 2024-08-07 10:46       ` Nikolai Kondrashov
  0 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-07 10:46 UTC (permalink / raw)
  To: Mark Brown, Nikolai Kondrashov
  Cc: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, kernelci@lists.linux.dev, Don Zickus,
	Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/6/24 10:26 PM, Mark Brown wrote:
> On Tue, Aug 06, 2024 at 08:39:13PM +0300, Nikolai Kondrashov wrote:
>> On 8/5/24 3:28 PM, Nikolai Kondrashov wrote:
> 
>>> The "compatible" property in the root of a device tree specifies the device
>>> vendor, the device (board) model, as well as often the device SoC and family.
>>> These are generally encoded as several strings, ordered from most to least
>>> specific, each potentially containing multiple parts separated by commas.
>>> E.g.:
> 
>>>     "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3"
> 
>> Just wanted to highlight, that we also need feedback on this. Thank you!
> 
> It looks good to me - I suspect it's just uncontroversial and everyone
> thinks it makes sense?  

I like this kind of feedback, yes :D Thanks, Mark!

Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: KCIDB: Support non-binary test outputs
  2024-08-06 19:20           ` Mark Brown
@ 2024-08-08 18:05             ` Nikolai Kondrashov
  2024-08-08 19:47               ` Mark Brown
  0 siblings, 1 reply; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-08 18:05 UTC (permalink / raw)
  To: Mark Brown
  Cc: Nikolai Kondrashov, syzkaller, Dmitry Vyukov, Vishal Bhoj,
	Alice Ferrazzi, automated-testing, Cristian Marussi, Tim Bird,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

On 8/6/24 10:20 PM, Mark Brown wrote:
> On Tue, Aug 06, 2024 at 08:03:00PM +0300, Nikolai Kondrashov wrote:
>
>> In that light, I think validation is the right way here. And validating
>> submitter-provided units could be the right way. This is actually similar to
>> the situation with test "paths". Currently we're accepting everything, but our
>> target is to tighten that down to help correlation. Perhaps by sending
>> (aggregated) warnings about unknown test paths to submitters.
>
> TBH as a submitter getting specific stuff back immediately (or at least
> the option for it) is really helpful.

Do you mean that you would like to know if you submitted something wrong
immediately?

Could you elaborate a bit here?

>> This looser approach allows us to admit new data to the database faster, as it
>> doesn't need to undergo cataloguing first. We certainly don't want to spend
>> time arbitrating *every* test name or a measurement unit at this point, and we
>> don't want to slow down adoption from CI systems, and introduction of new
>> tests. But I think letting them know about deviations could work. After all,
>> it's in the CI system interest to comply, as that improves result quality, and
>> raises the chance of reaching maintainers, which they're here for.
>
> Perhaps per test schemas of some kind (not sure how exactly you'd go
> about doing it) could help here, if the test is unknown then just let it
> in but if it's a test we know about and we've defined the units for then
> enforce those units?  That way there's the looser stuff and reporting
> that shows what we could work on standardising, and things that have
> been standardised are hopefully going to be more joined up?

Yes, that's the approach I was thinking about.

I'm not sure if we would be able to implement this kind of checking in JSON
schema, not likely. But we can certainly always implement a higher-order
check, which we can run on database data after it's updated.

>> However, I would leave it to the submitter to observe the correct exponent
>> (e.g. KB vs. MB vs. GB). We can perhaps standardize on units *without* the
>> metric prefix, and rely on either floating-point exponent, or larger integer
>> representations (like 8-byte bigints in PostgreSQL, JSON integers have no size
>> themselves) to handle the required ranges. This way the dashboard would be
>> able to display the units, *and* apply the prefix as necessary, automatically.
>> Or perhaps specify the prefix (or exponent) separately from the unit, so the
>> usual range would fit, but the dashboard could still scale the numbers on
>> display. OTOH, indices would be no use for separate value/exponent
>> representation.
>
> Putting the exponent in as a number does seem like it'd be much more
> helpful for machine processing.

It's helpful for normalizing to a single exponent, so data can be compared,
but that's not something that a database can do fast. So, not helpful in the
end.
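
For reference, the normalization in question could be sketched like this in
Python, on the read side rather than in the database. The (value, exponent)
pair layout and the function names are assumptions made for this example only:

```python
def normalize(value, exponent, target_exponent):
    """Re-express value * 10**exponent as an integer times 10**target_exponent.

    The target exponent must not be larger than the original one, otherwise
    the result could not stay an exact integer.
    """
    shift = exponent - target_exponent
    if shift < 0:
        raise ValueError("target exponent would lose precision")
    return value * 10**shift


def compare(a, b):
    """Compare two (value, exponent) pairs; return -1, 0, or 1."""
    # The smaller exponent is the one both values can be expressed in exactly
    target = min(a[1], b[1])
    na = normalize(a[0], a[1], target)
    nb = normalize(b[0], b[1], target)
    return (na > nb) - (na < nb)
```

The per-row arithmetic is trivial, but a query doing this cannot rely on a
plain index over the stored value, which is the concern here.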

>> And the final question we need to consider is how many people would actually
>> bother finding out what the units exactly are in their tests, and specifying
>> them?
>
> I suspect there's going to be a fair amount of stuff where there's a
> fairly clear specific unit for one reason or other that's commonly used
> when talking about the test (eg, things like I/O benchmarks, run times,
> or temperatures) so it'll be immediately obvious and also a bunch of
> things where the number is just a number for the benchmark and nobody
> cares about whatever the units actually are anyway.

Yeah, if we make the unit optional, and validate it only if it's specified in
requirements, then we can both have the cake and eat it too. That is, have more
alignment where we want it, and more freedom where we want to experiment.

I'll post a new version of the schema with an optional unit string. It would
expect people to keep the unit uniform and to use wider data types, so they
can keep the prefix/exponent constant instead of switching it, and we can add
validation later.

Thank you everyone for the feedback!

Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: KCIDB: Support non-binary test outputs
  2024-08-08 18:05             ` Nikolai Kondrashov
@ 2024-08-08 19:47               ` Mark Brown
  2024-08-15 15:29                 ` Nikolai Kondrashov
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Brown @ 2024-08-08 19:47 UTC (permalink / raw)
  To: Nikolai Kondrashov
  Cc: Nikolai Kondrashov, syzkaller, Dmitry Vyukov, Vishal Bhoj,
	Alice Ferrazzi, automated-testing, Cristian Marussi, Tim Bird,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

On Thu, Aug 08, 2024 at 09:05:20PM +0300, Nikolai Kondrashov wrote:
> On 8/6/24 10:20 PM, Mark Brown wrote:

> > TBH as a submitter getting specific stuff back immediately (or at least
> > the option for it) is really helpful.

> Do you mean that you would like to know if you submitted something wrong
> immediately?

> Could you elaborate a bit here?

The whole workflow where you push data in and then check to see if it
appeared in the dashboard can be a bit obscure if you're trying to get
things right.

> > Putting the exponent in as a number does seem like it'd be much more
> > helpful for machine processing.

> It's helpful for normalizing to a single exponent, so data can be compared,
> but that's not something that a database can do fast. So, not helpful in the
> end.

It might be helpful on the read side though (ie, doing it as part of
rendering rather than in the database itself)?  Not so good for queries
but useful for UIs.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: KCIDB: Support non-binary test outputs
  2024-08-08 19:47               ` Mark Brown
@ 2024-08-15 15:29                 ` Nikolai Kondrashov
  0 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-15 15:29 UTC (permalink / raw)
  To: Mark Brown
  Cc: Nikolai Kondrashov, syzkaller, Dmitry Vyukov, Vishal Bhoj,
	Alice Ferrazzi, automated-testing, Cristian Marussi, Tim Bird,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

On 8/8/24 10:47 PM, Mark Brown wrote:
> On Thu, Aug 08, 2024 at 09:05:20PM +0300, Nikolai Kondrashov wrote:
>> On 8/6/24 10:20 PM, Mark Brown wrote:
> 
>>> TBH as a submitter getting specific stuff back immediately (or at least
>>> the option for it) is really helpful.
> 
>> Do you mean that you would like to know if you submitted something wrong
>> immediately?
> 
>> Could you elaborate a bit here?
> 
> The whole workflow where you push data in and then check to see if it
> appeared in the dashboard can be a bit obscure if you're trying to get
> things right.

Makes sense. How would you prefer it to work? Throw an exception with details
of the problem when submitting using the Python API, and abort the
kcidb-submit with an error message when using command-line tools?

Do you validate your data before sending? We can add something on top of the
schema there.

I think there would always be "soft failures" of one nature or another, that
shouldn't prevent the data getting in, but would warrant eventual resolution.
Would you like to receive them as e.g. a daily digest, via email?

>>> Putting the exponent in as a number does seem like it'd be much more
>>> helpful for machine processing.
> 
>> It's helpful for normalizing to a single exponent, so data can be compared,
>> but that's not something that a database can do fast. So, not helpful in the
>> end.
> 
> It might be helpful on the read side though (ie, doing it as part of
> rendering rather than in the database itself)?  Not so good for queries
> but useful for UIs.

OK, I'll send a new version of the schema with units and exponent added, and
we'll just see how it goes.

Thanks for all the comments!
Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs V2
  2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
  2024-08-05 12:33   ` Mark Brown
  2024-08-05 17:02   ` Bird, Tim
@ 2024-08-16 10:03   ` Nikolai Kondrashov
  2024-08-16 12:58     ` Nikolai Kondrashov
  2024-08-16 14:05     ` [Automated-testing] KCIDB: Support non-binary test outputs V3 Nikolai Kondrashov
  2 siblings, 2 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-16 10:03 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Mark Brown, Philip Li, Denys Fedoryshchenko,
	Michael Hofmann, Tales da Aparecida, Aditya Nagesh,
	Jeny Dhruvit Sheth, Sachin Sant, Hambardzumyan, Minas

On 8/5/24 2:28 PM, Nikolai Kondrashov wrote:
>> I'll follow up with a separate message for each of the changes, going over the
>> details and the rationale.
> 
> The support for non-binary test outputs introduces a new field to the "test"
> objects called "value", being an "object" itself. It (and its abstract
> meaning) are supposed to work together with the "status" field. That is, it
> should only be considered when the test has actually executed. I.e. with a
> "FAIL", "ERROR", "PASS", or "DONE" status only. Normally "DONE" should be
> used, when the value is the test's output, and not an auxiliary value.

Alright, I redid support for non-binary test outputs:

https://github.com/kernelci/kcidb-io/pull/85/commits/040a20db407f2592c28c3797cec9e4118221af9c

First, I dropped support for all the different types, and left only the
(signed) integer, requiring the receiving system to allocate at least 64 bits
to it. That integer is the only required field for the numeric output.

I also added the 10-based "exponent" field, to support "floating point"
numbers, although I suspect we would prefer fixed-point numbers (consistent
exponent) instead, as that makes the queries easier, at least for the start.

Then there's the "unit" field, which is just a string, and if "exponent" is
specified, then the "unit" is assumed to not contain any prefixes, and they
are generated on display, with the value appropriately scaled (that SQL
function will be interesting to write).

Finally, there's the "binary" field, and if it's true, then a binary prefix is
generated instead. That is, Ki, Mi, Gi, and so on. Naturally it's only
considered when both "exponent" and "unit" are specified.

Here's the abbreviated schema:

    "number": {
        "type": "object",
        "properties": {
            "value":    {"type": "integer"},
            "unit":     {"type": "string"},
            "exponent": {"type": "integer"},
            "binary":   {"type": "boolean"}
        },
        "required": ["value"],
        "additionalProperties": False
    }

The commit linked above has docs and examples. I'll reproduce the latter here
in shorter form:

    "value": 42,
    # Display: 42

    "value": 314159,
    "exponent": -5,
    # Display: 3.14159

    "value": 720,
    "unit": "KB",
    # Display: 720 KB

    "value": 160,
    "unit": "s",
    "exponent": -9,
    # Display: 160 ns

    "value": 512,
    "unit": "B",
    "exponent": 3,
    "binary": True,
    # Display: 500 KiB
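
The display rules behind these examples could be sketched in Python roughly as
below. This is only an illustration of the behaviour described in this
message, not the implementation from the commit: the display() name, the
prefix tables, and the choice of the largest prefix not exceeding the value
are assumptions.

```python
from fractions import Fraction

# (power, prefix) tables, smallest first. "u" is an ASCII stand-in for micro.
SI = [(-9, "n"), (-6, "u"), (-3, "m"), (0, ""),
      (3, "k"), (6, "M"), (9, "G"), (12, "T")]
BINARY = [(0, ""), (10, "Ki"), (20, "Mi"), (30, "Gi"), (40, "Ti")]


def display(number):
    """Render a "number" object (a dict) as a human-readable string."""
    value = Fraction(number["value"])
    exponent = number.get("exponent")
    unit = number.get("unit")
    if exponent is not None:
        # The exponent is always 10-based, even for binary prefixes
        value *= Fraction(10) ** exponent
    if unit is None:
        return f"{float(value):g}"
    if exponent is None:
        # Without an exponent the unit is shown as is, prefix included
        return f"{float(value):g} {unit}"
    base, table = (2, BINARY) if number.get("binary") else (10, SI)
    # Pick the largest prefix whose multiplier does not exceed the value
    chosen_power, chosen_prefix = 0, ""
    for power, prefix in table:
        if abs(value) >= Fraction(base) ** power:
            chosen_power, chosen_prefix = power, prefix
    scaled = value / Fraction(base) ** chosen_power
    return f"{float(scaled):g} {chosen_prefix}{unit}"
```

Note that the ":g" formatting keeps only six significant digits, which is
enough for the examples above, but a real implementation would likely need
something more careful.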

Tell me what you think, especially Tim, and Mark!

Thank you.
Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs V2
  2024-08-16 10:03   ` [Automated-testing] KCIDB: Support non-binary test outputs V2 Nikolai Kondrashov
@ 2024-08-16 12:58     ` Nikolai Kondrashov
  2024-08-16 14:05     ` [Automated-testing] KCIDB: Support non-binary test outputs V3 Nikolai Kondrashov
  1 sibling, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-16 12:58 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Mark Brown, Philip Li, Denys Fedoryshchenko,
	Michael Hofmann, Tales da Aparecida, Aditya Nagesh,
	Jeny Dhruvit Sheth, Sachin Sant, Hambardzumyan, Minas

On 8/16/24 1:03 PM, Nikolai Kondrashov wrote:
> On 8/5/24 2:28 PM, Nikolai Kondrashov wrote:
>>> I'll follow up with a separate message for each of the changes, going over the
>>> details and the rationale.
>>
>> The support for non-binary test outputs introduces a new field to the "test"
>> objects called "value", being an "object" itself. It (and its abstract
>> meaning) are supposed to work together with the "status" field. That is, it
>> should only be considered when the test has actually executed. I.e. with a
>> "FAIL", "ERROR", "PASS", or "DONE" status only. Normally "DONE" should be
>> used, when the value is the test's output, and not an auxiliary value.
> 
> Alright, I redid support for non-binary test outputs:
> 
> https://github.com/kernelci/kcidb-io/pull/85/commits/040a20db407f2592c28c3797cec9e4118221af9c

Ah, hold it. I think I have a better idea.
Stand by for a new version!

Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs V3
  2024-08-16 10:03   ` [Automated-testing] KCIDB: Support non-binary test outputs V2 Nikolai Kondrashov
  2024-08-16 12:58     ` Nikolai Kondrashov
@ 2024-08-16 14:05     ` Nikolai Kondrashov
  2024-08-16 16:43       ` Bird, Tim
  2024-08-19  9:50       ` Nikolai Kondrashov
  1 sibling, 2 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-16 14:05 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, Guillaume Tucker, kernelci@lists.linux.dev,
	Don Zickus, Mark Brown, Philip Li, Denys Fedoryshchenko,
	Michael Hofmann, Tales da Aparecida, Aditya Nagesh,
	Jeny Dhruvit Sheth, Sachin Sant, Hambardzumyan, Minas

On 8/16/24 1:03 PM, Nikolai Kondrashov wrote:
> On 8/5/24 2:28 PM, Nikolai Kondrashov wrote:
>>> I'll follow up with a separate message for each of the changes, going over the
>>> details and the rationale.
>>
>> The support for non-binary test outputs introduces a new field to the "test"
>> objects called "value", being an "object" itself. It (and its abstract
>> meaning) are supposed to work together with the "status" field. That is, it
>> should only be considered when the test has actually executed. I.e. with a
>> "FAIL", "ERROR", "PASS", or "DONE" status only. Normally "DONE" should be
>> used, when the value is the test's output, and not an auxiliary value.
> 
> Alright, I redid support for non-binary test outputs:
> 
> https://github.com/kernelci/kcidb-io/pull/85/commits/040a20db407f2592c28c3797cec9e4118221af9c

Aaand, redone again:

https://github.com/kernelci/kcidb-io/pull/85/commits/86b75f6bf6e4f74e594a301382f2b77cfe4cbbd9

I switched the value from integer to floating-point, as that allows us to get
rid of the exponent field, at the cost of a small accuracy loss. The storage
requirements are still 64 bits.

These values are intended primarily for analysis and tracking, and a small
inaccuracy would have little effect on those. If you want to store the exact
measured value, you can put it into the "misc" field.
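
To illustrate where the accuracy loss comes from: a 64-bit float has a 53-bit
significand, so integers above 2**53 cannot all be represented. A small
sketch follows; the "exact_value" key and the placement of "number" next to
"misc" are assumptions for illustration only, not something the schema
prescribes.

```python
# A 64-bit float cannot hold every 64-bit integer exactly.
exact = 2**53 + 1          # e.g. a large counter or a nanosecond timestamp
stored = float(exact)      # what a floating-point "value" would hold
lost = exact - int(stored)  # how far off the stored value is

# Hypothetical fragment of a test object: the rounded value goes into
# "number", the exact one is kept as a string under an illustrative key.
fragment = {
    "number": {"value": stored},
    "misc": {"exact_value": str(exact)},
}
```

Anything up to 2**53 in magnitude round-trips exactly, which covers most
practical measurements.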

OTOH, this makes it much easier and faster to run queries on the values.

Here's the new (abbreviated) schema:

    "number": {
        "type": "object",
        "properties": {
            "value": {"type": "number"},
            "unit": {"type": "string"},
            "prefix": {
                "type": "string",
                "enum": ["metric", "binary"],
            },
        },
        "required": ["value"],
        "additionalProperties": False,
    }

And the rewritten examples:

    "value": 42,
    # Display: 42

    "value": 3.14159,
    # Display: 3.14159

    "value": 720,
    "unit": "KB",
    # Display: 720 KB

    "value": 145000,
    "prefix": "metric",
    # Display: 145 K

    "value": 1.6e-7,
    "unit": "s",
    "prefix": "metric",
    # Display: 160 ns

    "value": 5.12e5,
    "unit": "B",
    "prefix": "binary",
    # Display: 500 KiB
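
A minimal sketch of how display scaling could work with the floating-point
value and the "prefix" hint. It is not the actual KCIDB code: the name
`render_v3`, the prefix tables, and the clamping to the table range are all
assumptions made for illustration.

```python
import math

# Illustrative prefix tables; "K" follows the spelling used in this thread.
METRIC = {-9: "n", -6: "u", -3: "m", 0: "", 3: "K", 6: "M", 9: "G"}
BINARY = {0: "", 1: "Ki", 2: "Mi", 3: "Gi"}


def render_v3(value, unit=None, prefix=None):
    """Render a "number" object (float value, optional prefix hint) as text."""
    unit = unit or ""
    if prefix is None or value == 0:
        # No prefix requested: show the value and unit verbatim.
        return f"{value:g} {unit}".strip()
    if prefix == "metric":
        # Largest power of 1000 not exceeding the value, clamped to the table.
        power = math.floor(math.log10(abs(value)) / 3) * 3
        power = max(min(power, 9), -9)
        scaled, symbol = value / 10 ** power, METRIC[power]
    elif prefix == "binary":
        # Largest power of 1024 not exceeding the value, clamped to the table.
        index = math.floor(math.log2(abs(value)) / 10)
        index = max(min(index, 3), 0)
        scaled, symbol = value / 1024 ** index, BINARY[index]
    else:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return f"{scaled:g} {symbol}{unit}".strip()
```

The stored value stays in base units either way; only the rendering changes
between `render_v3(5.12e5, "B", "binary")` and
`render_v3(5.12e5, "B", "metric")`.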

Send comments!
Thank you.
Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [Automated-testing] KCIDB: Support non-binary test outputs V3
  2024-08-16 14:05     ` [Automated-testing] KCIDB: Support non-binary test outputs V3 Nikolai Kondrashov
@ 2024-08-16 16:43       ` Bird, Tim
  2024-08-16 18:07         ` Nikolai Kondrashov
  2024-08-19  9:50       ` Nikolai Kondrashov
  1 sibling, 1 reply; 26+ messages in thread
From: Bird, Tim @ 2024-08-16 16:43 UTC (permalink / raw)
  To: Nikolai Kondrashov, syzkaller, Dmitry Vyukov, Vishal Bhoj,
	Alice Ferrazzi, automated-testing@lists.yoctoproject.org,
	Cristian Marussi, Johnson George, Veronika Kabatova,
	Guillaume Tucker, kernelci@lists.linux.dev, Don Zickus,
	Mark Brown, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas



> -----Original Message-----
> From: automated-testing@lists.yoctoproject.org <automated-testing@lists.yoctoproject.org> On Behalf Of Nikolai Kondrashov
> On 8/16/24 1:03 PM, Nikolai Kondrashov wrote:
> > On 8/5/24 2:28 PM, Nikolai Kondrashov wrote:
> >>> I'll follow up with a separate message for each of the changes, going over the
> >>> details and the rationale.
> >>
> >> The support for non-binary test outputs introduces a new field to the "test"
> >> objects called "value", being an "object" itself. It (and its abstract
> >> meaning) are supposed to work together with the "status" field. That is, it
> >> should only be considered when the test has actually executed. I.e. with a
> >> "FAIL", "ERROR", "PASS", or "DONE" status only. Normally "DONE" should be
> >> used, when the value is the test's output, and not an auxiliary value.
> >
> > Alright, I redid support for non-binary test outputs:
> >
> > https://github.com/kernelci/kcidb-io/pull/85/commits/040a20db407f2592c28c3797cec9e4118221af9c
> 
> Aaand, redone again:
> 
> https://github.com/kernelci/kcidb-io/pull/85/commits/86b75f6bf6e4f74e594a301382f2b77cfe4cbbd9
> 
> I switched the value from integer to floating-point, as that allows us to get
> rid of the exponent field, at the cost of a small accuracy loss. The storage
> requirements are still 64 bits.
> 
> These values are intended for analysis and tracking first of all, and small
> inaccuracy would have little effect on them. If you want to store the exact
> measured value, you can put it into the "misc" field.
> 
> OTOH, this makes it much easier and faster to run queries on the values.
> 
> Here's the new (abbreviated schema):
> 
>     "number": {
>         "type": "object",
>         "properties": {
>             "value": {"type": "number"},
>             "unit": {"type": "string"},
>             "prefix": {
>                 "type": "string",
>                 "enum": ["metric", "binary"],

What is the purpose of the metric/binary enum?  Is it to disambiguate the meaning
of K and M in unit prefixes?  Is it only for display?

>             },
>         },
>         "required": ["value"],
>         "additionalProperties": False,
>     }
> 
> And the rewritten examples:
> 
>     "value": 42,
>     # Display: 42
> 
>     "value": 3.14159,
>     # Display: 3.14159
> 
>     "value": 720,
>     "unit": "KB",
>     # Display: 720 KB
> 
>     "value": 145000,
>     "prefix": "metric"
>     # Display: 145 K
> 
>     "value": 1.6e-7,
>     "unit": "s",
>     "prefix": "metric",
>     # Display: 160 ns
> 
>     "value": 5.12e5,
>     "unit": "B",
>     "prefix": "binary",
>     # Display: 500 KiB


Looks good to me!!
 -- Tim


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs V3
  2024-08-16 16:43       ` Bird, Tim
@ 2024-08-16 18:07         ` Nikolai Kondrashov
  0 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-16 18:07 UTC (permalink / raw)
  To: Tim Bird, Nikolai Kondrashov, syzkaller, Dmitry Vyukov,
	Vishal Bhoj, Alice Ferrazzi,
	automated-testing@lists.yoctoproject.org, Cristian Marussi,
	Johnson George, Veronika Kabatova, Guillaume Tucker,
	kernelci@lists.linux.dev, Don Zickus, Mark Brown, Philip Li,
	Denys Fedoryshchenko, Michael Hofmann, Tales da Aparecida,
	Aditya Nagesh, Jeny Dhruvit Sheth, Sachin Sant,
	Hambardzumyan, Minas

On 8/16/24 7:43 PM, Tim Bird wrote:
>> -----Original Message-----
>> From: automated-testing@lists.yoctoproject.org <automated-testing@lists.yoctoproject.org> On Behalf Of Nikolai Kondrashov
>> On 8/16/24 1:03 PM, Nikolai Kondrashov wrote:
>>
>>     "number": {
>>         "type": "object",
>>         "properties": {
>>             "value": {"type": "number"},
>>             "unit": {"type": "string"},
>>             "prefix": {
>>                 "type": "string",
>>                 "enum": ["metric", "binary"],
> 
> What is the purpose of the metric/binary enum?  Is it to disambiguate the meaning
> of K and M in unit prefixes?  Is it only for display?

Yep! Only for display. For those people who prefer **real** kilobytes :D

Taking the previous example:

    "value": 5.12e5,
    "unit": "B",
    "prefix": "binary",
    # Display: 500 KiB

And changing the prefix to "metric" produces this:

    "value": 5.12e5,
    "unit": "B",
    "prefix": "metric",
    # Display: 512 KB

Which could make some old-school people feel robbed.

>> And the rewritten examples:
>>
>>     "value": 42,
>>     # Display: 42
>>
>>     "value": 3.14159,
>>     # Display: 3.14159
>>
>>     "value": 720,
>>     "unit": "KB",
>>     # Display: 720 KB
>>
>>     "value": 145000,
>>     "prefix": "metric"
>>     # Display: 145 K
>>
>>     "value": 1.6e-7,
>>     "unit": "s",
>>     "prefix": "metric",
>>     # Display: 160 ns
>>
>>     "value": 5.12e5,
>>     "unit": "B",
>>     "prefix": "binary",
>>     # Display: 500 KiB
> 
> 
> Looks good to me!!

Awesome!

Thanks, Tim.
Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs V3
  2024-08-16 14:05     ` [Automated-testing] KCIDB: Support non-binary test outputs V3 Nikolai Kondrashov
  2024-08-16 16:43       ` Bird, Tim
@ 2024-08-19  9:50       ` Nikolai Kondrashov
  2024-08-26 10:37         ` Nikolai Kondrashov
  1 sibling, 1 reply; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-19  9:50 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, kernelci@lists.linux.dev, Don Zickus,
	Mark Brown, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/16/24 5:05 PM, Nikolai Kondrashov wrote:
> Aaand, redone again:
> 
> https://github.com/kernelci/kcidb-io/pull/85/commits/86b75f6bf6e4f74e594a301382f2b77cfe4cbbd9
> 
> I switched the value from integer to floating-point, as that allows us to get
> rid of the exponent field, at the cost of a small accuracy loss. The storage
> requirements are still 64 bits.

If there are no more objections, I'm going to merge this on Wednesday, Aug 21.

I have three more additions to the schema in the pipeline, sending them soon.

Nick

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Automated-testing] KCIDB: Support non-binary test outputs V3
  2024-08-19  9:50       ` Nikolai Kondrashov
@ 2024-08-26 10:37         ` Nikolai Kondrashov
  0 siblings, 0 replies; 26+ messages in thread
From: Nikolai Kondrashov @ 2024-08-26 10:37 UTC (permalink / raw)
  To: syzkaller, Dmitry Vyukov, Vishal Bhoj, Alice Ferrazzi,
	automated-testing, Cristian Marussi, Tim Bird, Johnson George,
	Veronika Kabatova, kernelci@lists.linux.dev, Don Zickus,
	Mark Brown, Philip Li, Denys Fedoryshchenko, Michael Hofmann,
	Tales da Aparecida, Aditya Nagesh, Jeny Dhruvit Sheth,
	Sachin Sant, Hambardzumyan, Minas

On 8/19/24 12:50 PM, Nikolai Kondrashov wrote:
> On 8/16/24 5:05 PM, Nikolai Kondrashov wrote:
>> Aaand, redone again:
>>
>> https://github.com/kernelci/kcidb-io/pull/85/commits/86b75f6bf6e4f74e594a301382f2b77cfe4cbbd9
>>
>> I switched the value from integer to floating-point, as that allows us to get
>> rid of the exponent field, at the cost of a small accuracy loss. The storage
>> requirements are still 64 bits.
> 
> If there are no more objections, I'm going to merge this on Wednesday, Aug 21.

Support for this (and the "compatible" field) was merged and deployed
last week. The sooner you send us your data, the sooner we'll figure out how
to process it!

Thank you everyone for the comments!
Nick


^ permalink raw reply	[flat|nested] 26+ messages in thread

Thread overview: 26+ messages
2024-08-05  9:56 KCIDB: Test schema enhancements Nikolai Kondrashov
2024-08-05 11:28 ` KCIDB: Support non-binary test outputs Nikolai Kondrashov
2024-08-05 12:33   ` Mark Brown
2024-08-05 14:26     ` Nikolai Kondrashov
2024-08-05 16:32       ` Mark Brown
2024-08-06 17:03         ` Nikolai Kondrashov
2024-08-06 19:20           ` Mark Brown
2024-08-08 18:05             ` Nikolai Kondrashov
2024-08-08 19:47               ` Mark Brown
2024-08-15 15:29                 ` Nikolai Kondrashov
2024-08-05 17:11     ` Bird, Tim
2024-08-06 17:18       ` [Automated-testing] " Nikolai Kondrashov
2024-08-05 17:02   ` Bird, Tim
2024-08-06 17:32     ` [Automated-testing] " Nikolai Kondrashov
2024-08-16 10:03   ` [Automated-testing] KCIDB: Support non-binary test outputs V2 Nikolai Kondrashov
2024-08-16 12:58     ` Nikolai Kondrashov
2024-08-16 14:05     ` [Automated-testing] KCIDB: Support non-binary test outputs V3 Nikolai Kondrashov
2024-08-16 16:43       ` Bird, Tim
2024-08-16 18:07         ` Nikolai Kondrashov
2024-08-19  9:50       ` Nikolai Kondrashov
2024-08-26 10:37         ` Nikolai Kondrashov
2024-08-05 12:28 ` KCIDB: Test schema enhancements Nikolai Kondrashov
2024-08-06 17:39   ` [Automated-testing] " Nikolai Kondrashov
2024-08-06 19:26     ` Mark Brown
2024-08-07 10:46       ` Nikolai Kondrashov
2024-08-05 12:34 ` Nikolai Kondrashov
