Array manipulation revisit

devicetree-compiler.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Array manipulation revisit
@ 2024-08-02 15:36 Hoover, Erich (Orion)
  2024-08-03  6:37 ` David Gibson
  0 siblings, 1 reply; 9+ messages in thread
From: Hoover, Erich (Orion) @ 2024-08-02 15:36 UTC (permalink / raw)
  To: devicetree-compiler@vger.kernel.org

I saw this old thread on array manipulation:
    https://www.spinics.net/lists/devicetree-compiler/msg02575.html
and was wondering if anything got implemented to handle arrays and I'm just not finding it.

For most cases I've encountered there are workarounds for dealing with the arrays, but I'd like to figure out a way to update GPIO line names from an overlay.  Example:

&gpio {
	gpio-line-names =
		"QSPI_CLK", /* GPIO0 (example, not always the same) */
		"QSPI_DQ1", /* GPIO1 (example, not always the same) */
		"QSPI_DQ2", /* GPIO2 (example, not always the same) */
		"QSPI_DQ3", /* GPIO3 (example, not always the same) */
		"QSPI_DQ0", /* GPIO4 (example, not always the same) */
		"QSPI_CS_B", /* GPIO5 (example, not always the same) */
		/* ... */
		"GPIO_MUX_SEL0", /* GPIO148 (thing to add) */
		"GPIO_MUX_SEL1", /* GPIO149 (thing to add) */
		/* ... */
	};
};

What I'm looking at is that I have several layers of device tree information:
    1) shared-based-device.dtb
    2) fpga-overlay.dtbo (different for every device)
    3) config-specific-overlay.dtbo

When all of this information is integrated into a single tree I can use #defines to set names for all the pins and then write out the pin information all at once (though this is a little painful, since it requires a #define for each pin).  However, when this information is in an overlay I need to know what the current contents are to replace things properly.  So, what I would _like_ to be able to do is something like this:

&gpio {
	gpio-line-names[148] = "GPIO_MUX_SEL0";
	gpio-line-names[149] = "GPIO_MUX_SEL1";
};

Then when the overlay is applied what I would expect to happen would be for libfdt to read the existing value, find and replace the specified indices, and then set the full property with the new value.  Is this totally crazy?  Is there already a better way to do this?

Best,
Erich E. Hoover, Ph.D.
erich.hoover@arcfield.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Array manipulation revisit
  2024-08-02 15:36 Array manipulation revisit Hoover, Erich (Orion)
@ 2024-08-03  6:37 ` David Gibson
  2024-08-05 16:35   ` Hoover, Erich (Orion)
  0 siblings, 1 reply; 9+ messages in thread
From: David Gibson @ 2024-08-03  6:37 UTC (permalink / raw)
  To: Hoover, Erich (Orion); +Cc: devicetree-compiler@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 4611 bytes --]

On Fri, Aug 02, 2024 at 03:36:43PM +0000, Hoover, Erich (Orion) wrote:
> I saw this old thread on array manipulation:
>     https://www.spinics.net/lists/devicetree-compiler/msg02575.html
> and was wondering if anything got implemented to handle arrays and I'm just not finding it.
> 
> For most cases I've encountered there are workarounds for dealing with the arrays, but I'd like to figure out a way to update GPIO line names from an overlay.  Example:
> 
> &gpio {
> 	gpio-line-names =
> 		"QSPI_CLK", /* GPIO0 (example, not always the same) */
> 		"QSPI_DQ1", /* GPIO1 (example, not always the same) */
> 		"QSPI_DQ2", /* GPIO2 (example, not always the same) */
> 		"QSPI_DQ3", /* GPIO3 (example, not always the same) */
> 		"QSPI_DQ0", /* GPIO4 (example, not always the same) */
> 		"QSPI_CS_B", /* GPIO5 (example, not always the same) */
> 		/* ... */
> 		"GPIO_MUX_SEL0", /* GPIO148 (thing to add) */
> 		"GPIO_MUX_SEL1", /* GPIO149 (thing to add) */
> 		/* ... */
> 	};
> };
> 
> What I'm looking at is that I have several layers of device tree information:
>     1) shared-based-device.dtb
>     2) fpga-overlay.dtbo (different for every device)
>     3) config-specific-overlay.dtbo
> 
> When all of this information is integrated into a single tree I can use #defines to set names for all the pins and then write out the pin information all at once (though this is a little painful, since it requires a #define for each pin).  However, when this information is in an overlay I need to know what the current contents are to replace things properly.  So, what I would _like_ to be able to do is something like this:
> 
> &gpio {
> 	gpio-line-names[148] = "GPIO_MUX_SEL0";
> 	gpio-line-names[149] = "GPIO_MUX_SEL1";
> };
> 
> Then when the overlay is applied what I would expect to happen would
> be for libfdt to read the existing value, find and replace the
> specified indices, and then set the full property with the new
> value.  Is this totally crazy?  Is there already a better way to do
> this?

Well, it's not crazy to want, but it's basically infeasible to
implement (within something resembling the current formats).

The heart of the problem is that "arrays" aren't actually a thing in
the dtb format: each property is simply a bytestring.  You can
structure things as lists in the dts for convenience, but it's all
just flattened into a bytestring in the output - the user of the dtb
has to know the expected format (specified in the binding) in order to
decipher it again.  For example:

	foo = "a", b";
	foo = "a\0b";
	foo = [61006200];
	foo = <0x61006200>;

Will all produce byte-for-byte identical output in the dtb.

In the case of an "array" of strings it's even worse: all the strings
(each with a terminating \0) are just appended together to form the
property bytestring, so the "entries" aren't of consistent length, you
have to scan the whole thing to find an entry of a given number.

dtb overlays are kind of a hack layered on top of dtb, and don't have
a way to represent things at a granularity less than a whole property
(except for special handling of phandles).

Introducing new syntax to dts always requires care to not break
compatibility, but that's really the least of the problems here.  To
implement this you'd need to invent a way of encoding partial property
updates into dtb, which would be an incompatible extension.  It would
need to cover ways of describing the part of the property to be
replaced more complex than fixed byte offsets to handle the string
list case here.  Finally you'd need to update libfdt (and any other
overlay implementation out there) to process this new fixup
information.

So, yeah, it's not going to happen.

However.  If you're able to have an actual program on the target
processing things, rather than just applying an overlay, then there
are more options.  You can then do whatever logic you need in your
program, using libfdt to read update the dtb.  I'm open to adding
additional helper functions if they're useful for things here.

Aside: you could argue that using a huge list of strings like this is
a poor design choice in the gpio binding, instead of, say, having each
gpio line as a different subnode using 'reg' to index them explicitly.
But, spec design tradeoffs are always tricky, and we're pretty much
stuck with it now, anyway.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Array manipulation revisit
  2024-08-03  6:37 ` David Gibson
@ 2024-08-05 16:35   ` Hoover, Erich (Orion)
  2024-08-05 22:59     ` Rob Herring
  2024-08-06  2:19     ` David Gibson
  0 siblings, 2 replies; 9+ messages in thread
From: Hoover, Erich (Orion) @ 2024-08-05 16:35 UTC (permalink / raw)
  To: David Gibson; +Cc: devicetree-compiler@vger.kernel.org

> From: David Gibson
> Sent: Saturday, August 3, 2024 12:37 AM
> ...
> dtb overlays are kind of a hack layered on top of dtb, and don't have a way to represent things at a granularity less than a whole property (except for special handling of phandles).

I'm going to say that they seem to be a _very_ useful hack.

> Introducing new syntax to dts always requires care to not break compatibility, but that's really the least of the problems here.  To implement this you'd need to invent a way of encoding partial property updates into dtb, which would be an incompatible extension.  It would need to cover ways of describing the part of the property to be replaced more complex than fixed byte offsets to handle the string list case here.

Maybe I'm being naive, but I think that this syntax could be very simple.  It looks to me like the current implementation disallows properties with brackets in the name, so I would propose taking advantage of that to create an invalid property name:
    PROPERTY[S*INDEX] = /bits/ BITS VALUE;
where S (single byte element size) is one of 1,2,4,8 or s (string array).  I could see "S*" being filled in automatically from the RHS data type (but override-able if you need to do something special).  So, in our string array case that means:
    PROPERTY[INDEX] = "string";
becomes a property named "PROPERTY[s*INDEX]" with the value "string" when stored in the dtbo.

> It would need to cover ways of describing the part of the property to be replaced more complex than fixed byte offsets to handle the string list case here.

It seems like string arrays are the only complicated case, and by knowing that it's a string array (like with the above "s" size byte) the implementation can count NULL bytes to find the correct element to replace.

> Finally you'd need to update libfdt (and any other overlay implementation out there) to process this new fixup information.
> 
> So, yeah, it's not going to happen.

Maybe I'm missing something, but if we're adding the information to the property name in a way that's currently invalid then does that solve this problem?  For older utilities, or ones without the handling for this extra information, the utility generates an error or just adds a strangely named property that gets ignored (when I gave this a quick try by hex editing a DTBO, the kernel happily gave me a "gpio-line-names[s*148]").

> However.  If you're able to have an actual program on the target processing things, rather than just applying an overlay, then there are more options.  You can then do whatever logic you need in your program, using libfdt to read update the dtb.  I'm open to adding additional helper functions if they're useful for things here.

For my case I am already using dtmerge on the target to produce the final dtb, so I can easily add a hack to dtmerge to manage this process.  That said, I'd like to avoid making something "completely" non-standard.  My preference would be to implement the feature in the official tools (in a way that's acceptable to the broader community) and then hack our tools to work with the same thing.

> Aside: you could argue that using a huge list of strings like this is a poor design choice in the gpio binding, instead of, say, having each gpio line as a different subnode using 'reg' to index them explicitly.  But, spec design tradeoffs are always tricky, and we're pretty much stuck with it now, anyway.
> ...

Yeah, I understand why they did it the way they did - but it definitely makes this particular task harder.  Some of our systems are incredibly storage-constrained, otherwise I would just throw the device tree headers and a compiler on the system and use a #define for each pin.

Best,
Erich E. Hoover, Ph.D.
erich.hoover@arcfield.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Array manipulation revisit
  2024-08-05 16:35   ` Hoover, Erich (Orion)
@ 2024-08-05 22:59     ` Rob Herring
  2024-08-06  2:30       ` David Gibson
  2024-08-06  2:19     ` David Gibson
  1 sibling, 1 reply; 9+ messages in thread
From: Rob Herring @ 2024-08-05 22:59 UTC (permalink / raw)
  To: Hoover, Erich (Orion); +Cc: David Gibson, devicetree-compiler@vger.kernel.org

On Mon, Aug 05, 2024 at 04:35:14PM +0000, Hoover, Erich (Orion) wrote:
> > From: David Gibson
> > Sent: Saturday, August 3, 2024 12:37 AM
> > ...

Please wrap quoted lines...

> > dtb overlays are kind of a hack layered on top of dtb, and don't 
> > have a way to represent things at a granularity less than a whole 
> > property (except for special handling of phandles).
> 
> I'm going to say that they seem to be a _very_ useful hack.
> 
> > Introducing new syntax to dts always requires care to not break 
> > compatibility, but that's really the least of the problems here.  
> > To implement this you'd need to invent a way of encoding partial 
> > property updates into dtb, which would be an incompatible 
> > extension.  It would need to cover ways of describing the part of 
> > the property to be replaced more complex than fixed byte offsets to 
> > handle the string list case here.
> 
> Maybe I'm being naive, but I think that this syntax could be very 
> simple.  It looks to me like the current implementation disallows 
> properties with brackets in the name, so I would propose taking 
> advantage of that to create an invalid property name:
>     PROPERTY[S*INDEX] = /bits/ BITS VALUE;
> where S (single byte element size) is one of 1,2,4,8 or s (string 
> array).  I could see "S*" being filled in automatically from the RHS 
> data type (but override-able if you need to do something special).  
> So, in our string array case that means:
>     PROPERTY[INDEX] = "string";
> becomes a property named "PROPERTY[s*INDEX]" with the value "string" 
> when stored in the dtbo.

The above doesn't work because existing software can't parse it if it 
is invalid.

This just continues the hack that is overlays. The hack is putting 
information which should be in the format itself (DTB) into the contents 
instead. The lack of type information is a major root issue. That's why 
__fixups__ and __local_fixups__ nodes exist. They store phandle 
locations with properties.

I think the major mistake we made with overlays was trying to preserve 
DTB format. That's ended up being pretty pointless because we have to 
update software applying overlays anyways. So why not just make the 
updated s/w understand a new DTB format. We could have required applying 
an overlay required a DTB in the new format (you need a rebuilt DTB 
anyways to add the symbols to it). Backwards compatibility could have 
been maintained by outputing the old format before passing to the OS. 

The other issue is when a new DTB format is proposed, it becomes a 
laundry list[1] of changes that goes nowhere.

What I think we should do is first rev the DTB format to allow for 
unknown tags so we can add additional, optional data which software can 
safely ignore. That would be an incompatible change. After that, we can 
add new tags in a backwards compatible way. Roughly what's needed is the 
same information stored in dtc's "marker" data: types and offsets.

> > It would need to cover ways of describing the part of the property 
> > to be replaced more complex than fixed byte offsets to handle the 
> > string list case here.
> 
> It seems like string arrays are the only complicated case, and by 
> knowing that it's a string array (like with the above "s" size byte) 
> the implementation can count NULL bytes to find the correct element to replace.

prop = "foo\0bar", "baz";

How many strings in "prop"? :)

Strings are actually the easy case IME as the heuristics to determine 
if a property is a string works pretty well (minus the above which is a 
test case in dtc, but not something I've seen in practice). It's 
determining the size of integer arrays I have problems with (in context 
of schema validation).

> > Finally you'd need to update libfdt (and any other overlay 
> > implementation out there) to process this new fixup information.
> > 
> > So, yeah, it's not going to happen.

Why not? Yeah, updating stuff is painful, but so is not fixing some of 
these problems. Are we just going to live with them forever? I think the 
reality is most things use libfdt (probably everything that cares about 
overlays) and they update libfdt at some frequency. And how old of a 
component do we really need to worry about compatibility with? 

This reminds me of the saying about planting trees. When is the best 
time to break compatibility? 15 years ago. When is the 2nd best time? 
Now. 

Rob

[1] https://elinux.org/New_FDT_format

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Array manipulation revisit
  2024-08-05 16:35   ` Hoover, Erich (Orion)
  2024-08-05 22:59     ` Rob Herring
@ 2024-08-06  2:19     ` David Gibson
  2024-08-06 16:29       ` [EXTERNAL] " Hoover, Erich (Orion)
  1 sibling, 1 reply; 9+ messages in thread
From: David Gibson @ 2024-08-06  2:19 UTC (permalink / raw)
  To: Hoover, Erich (Orion); +Cc: devicetree-compiler@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 4994 bytes --]

On Mon, Aug 05, 2024 at 04:35:14PM +0000, Hoover, Erich (Orion) wrote:
> > From: David Gibson
> > Sent: Saturday, August 3, 2024 12:37 AM
> > ...
> > dtb overlays are kind of a hack layered on top of dtb, and don't have a way to represent things at a granularity less than a whole property (except for special handling of phandles).
> 
> I'm going to say that they seem to be a _very_ useful hack.

I won't disagree.

> > Introducing new syntax to dts always requires care to not break compatibility, but that's really the least of the problems here.  To implement this you'd need to invent a way of encoding partial property updates into dtb, which would be an incompatible extension.  It would need to cover ways of describing the part of the property to be replaced more complex than fixed byte offsets to handle the string list case here.
> 
> Maybe I'm being naive, but I think that this syntax could be very simple.  It looks to me like the current implementation disallows properties with brackets in the name, so I would propose taking advantage of that to create an invalid property name:
>     PROPERTY[S*INDEX] = /bits/ BITS VALUE;
> where S (single byte element size) is one of 1,2,4,8 or s (string array).  I could see "S*" being filled in automatically from the RHS data type (but override-able if you need to do something special).  So, in our string array case that means:
>     PROPERTY[INDEX] = "string";
> becomes a property named "PROPERTY[s*INDEX]" with the value "string" when stored in the dtbo.

Eh, maybe, but as I say that's the least of the problems.

> > It would need to cover ways of describing the part of the property to be replaced more complex than fixed byte offsets to handle the string list case here.
> 
> It seems like string arrays are the only complicated case,

Not really.  Mixed properties that have both integers and strings in
them get even worse.  Those aren't common, but they exist.  Or
properties that have a mixture of integer sizes - that's much more
common, since it can include 'reg', 'ranges' and 'interrupts'.

> and by
> knowing that it's a string array (like with the above "s" size byte)
> the implementation can count NULL bytes to find the correct element
> to replace.
> 
> > Finally you'd need to update libfdt (and any other overlay implementation out there) to process this new fixup information.
> > 
> > So, yeah, it's not going to happen.
> 
> Maybe I'm missing something, but if we're adding the information to
> the property name in a way that's currently invalid then does that
> solve this problem?

Ah, so you're suggesting putting those annotations in the property
name itself inside the dtbo.  That's not a bad idea as encodings go.

> For older utilities, or ones without the
> handling for this extra information, the utility generates an error
> or just adds a strangely named property that gets ignored (when I
> gave this a quick try by hex editing a DTBO, the kernel happily gave
> me a "gpio-line-names[s*148]").

Right, but then you silently get output that wasn't what you expected
which can be nasty to debug.

> > However.  If you're able to have an actual program on the target
> > processing things, rather than just applying an overlay, then
> > there are more options.  You can then do whatever logic you need
> > in your program, using libfdt to read update the dtb.  I'm open to
> > adding additional helper functions if they're useful for things
> > here.
> 
> For my case I am already using dtmerge on the target to produce the
> final dtb, so I can easily add a hack to dtmerge to manage this
> process.  That said, I'd like to avoid making something "completely"
> non-standard.  My preference would be to implement the feature in
> the official tools (in a way that's acceptable to the broader
> community) and then hack our tools to work with the same thing.
> 
> > Aside: you could argue that using a huge list of strings like this
> > is a poor design choice in the gpio binding, instead of, say,
> > having each gpio line as a different subnode using 'reg' to index
> > them explicitly.  But, spec design tradeoffs are always tricky,
> > and we're pretty much stuck with it now, anyway.  ...
> 
> Yeah, I understand why they did it the way they did - but it
> definitely makes this particular task harder.  Some of our systems
> are incredibly storage-constrained,

Well, that raises another issue: this would add a substantial amount
of code to libfdt, which is supposed to be runnable in very
constrained environments.

> otherwise I would just throw the
> device tree headers and a compiler on the system and use a #define
> for each pin.
> 
> Best,
> Erich E. Hoover, Ph.D.
> erich.hoover@arcfield.com

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Array manipulation revisit
  2024-08-05 22:59     ` Rob Herring
@ 2024-08-06  2:30       ` David Gibson
  0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2024-08-06  2:30 UTC (permalink / raw)
  To: Rob Herring; +Cc: Hoover, Erich (Orion), devicetree-compiler@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 6302 bytes --]

On Mon, Aug 05, 2024 at 04:59:35PM -0600, Rob Herring wrote:
> On Mon, Aug 05, 2024 at 04:35:14PM +0000, Hoover, Erich (Orion) wrote:
> > > From: David Gibson
> > > Sent: Saturday, August 3, 2024 12:37 AM
> > > ...
> 
> Please wrap quoted lines...
> 
> > > dtb overlays are kind of a hack layered on top of dtb, and don't 
> > > have a way to represent things at a granularity less than a whole 
> > > property (except for special handling of phandles).
> > 
> > I'm going to say that they seem to be a _very_ useful hack.
> > 
> > > Introducing new syntax to dts always requires care to not break 
> > > compatibility, but that's really the least of the problems here.  
> > > To implement this you'd need to invent a way of encoding partial 
> > > property updates into dtb, which would be an incompatible 
> > > extension.  It would need to cover ways of describing the part of 
> > > the property to be replaced more complex than fixed byte offsets to 
> > > handle the string list case here.
> > 
> > Maybe I'm being naive, but I think that this syntax could be very 
> > simple.  It looks to me like the current implementation disallows 
> > properties with brackets in the name, so I would propose taking 
> > advantage of that to create an invalid property name:
> >     PROPERTY[S*INDEX] = /bits/ BITS VALUE;
> > where S (single byte element size) is one of 1,2,4,8 or s (string 
> > array).  I could see "S*" being filled in automatically from the RHS 
> > data type (but override-able if you need to do something special).  
> > So, in our string array case that means:
> >     PROPERTY[INDEX] = "string";
> > becomes a property named "PROPERTY[s*INDEX]" with the value "string" 
> > when stored in the dtbo.
> 
> The above doesn't work because existing software can't parse it if it 
> is invalid.
> 
> This just continues the hack that is overlays. The hack is putting 
> information which should be in the format itself (DTB) into the contents 
> instead. The lack of type information is a major root issue. That's why 
> __fixups__ and __local_fixups__ nodes exist. They store phandle 
> locations with properties.
> 
> I think the major mistake we made with overlays was trying to preserve 
> DTB format. That's ended up being pretty pointless because we have to 
> update software applying overlays anyways. So why not just make the 
> updated s/w understand a new DTB format. We could have required applying 
> an overlay required a DTB in the new format (you need a rebuilt DTB 
> anyways to add the symbols to it). Backwards compatibility could have 
> been maintained by outputing the old format before passing to the OS. 

I tend to agree.

> The other issue is when a new DTB format is proposed, it becomes a 
> laundry list[1] of changes that goes nowhere.

Right.  Note there are also two layers even here: there's the abstract
format of the underlying device tree itself (a tree of nodes with
bytestring properties, as defined by IEEE1275), then there's the
linearized dtb format.

> What I think we should do is first rev the DTB format to allow for 
> unknown tags so we can add additional, optional data which software can 
> safely ignore. That would be an incompatible change. After that, we can 
> add new tags in a backwards compatible way. Roughly what's needed is the 
> same information stored in dtc's "marker" data: types and offsets.

You could do this, but it has its own difficulties.  It's adding
metadata in the dtb that's not present in the underlying device tree.
That's already a bit risky in terms of things relying on it at stages
they shouldn't - people already too often fail to realize that "a\0b"
and <0x61006200> are indistinguishable once you're in the kernel.
Making it so you _can_ distinguish them right up until the very last
stage seems likely to make that worse.

You could consider replacing the whole - admittendly pretty legacy -
OF1275 style information with something more modern and
self-describing.  But of course that's an even more widespread
overhaul.  And it's not obvious what you'd use: json (or one of the
various isomorphic formats) seems the obvious choice, except that json
can't naturally and safely encode 64-bit integers, which is a pretty
fatal flaw for this use case.

> > > It would need to cover ways of describing the part of the property 
> > > to be replaced more complex than fixed byte offsets to handle the 
> > > string list case here.
> > 
> > It seems like string arrays are the only complicated case, and by 
> > knowing that it's a string array (like with the above "s" size byte) 
> > the implementation can count NULL bytes to find the correct element to replace.
> 
> prop = "foo\0bar", "baz";
> 
> How many strings in "prop"? :)
> 
> Strings are actually the easy case IME as the heuristics to determine 
> if a property is a string works pretty well (minus the above which is a 
> test case in dtc, but not something I've seen in practice). It's 
> determining the size of integer arrays I have problems with (in context 
> of schema validation).
> 
> > > Finally you'd need to update libfdt (and any other overlay 
> > > implementation out there) to process this new fixup information.
> > > 
> > > So, yeah, it's not going to happen.
> 
> Why not? Yeah, updating stuff is painful, but so is not fixing some of 
> these problems. Are we just going to live with them forever? I think the 
> reality is most things use libfdt (probably everything that cares about 
> overlays) and they update libfdt at some frequency. And how old of a 
> component do we really need to worry about compatibility with?

I was meaning in the context of retaining a compatible dtb format.
Going outside that opens new possibilities.  But, that's a big job and
not one I'm going to take on.

> This reminds me of the saying about planting trees. When is the best 
> time to break compatibility? 15 years ago. When is the 2nd best time? 
> Now. 
> 
> Rob
> 
> [1] https://elinux.org/New_FDT_format
> 

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [EXTERNAL] Re: Array manipulation revisit
  2024-08-06  2:19     ` David Gibson
@ 2024-08-06 16:29       ` Hoover, Erich (Orion)
  2024-08-20 20:06         ` Hoover, Erich (Orion)
  2024-08-21  3:44         ` David Gibson
  0 siblings, 2 replies; 9+ messages in thread
From: Hoover, Erich (Orion) @ 2024-08-06 16:29 UTC (permalink / raw)
  To: David Gibson; +Cc: devicetree-compiler@vger.kernel.org

> From: David Gibson
> Sent: Monday, August 5, 2024 8:19 PM
> ...
> > It seems like string arrays are the only complicated case,
> 
> Not really.  Mixed properties that have both integers and strings in
> them get even worse.

I was not thinking of this case, but if you want to handle this case in an initial implementation I would suggest:
   PROPERTY[1*5+s*1+4*0] = /bits/ 32 0x0;
(skip five bytes, skip a string, replace first 32-bit array element)

> Those aren't common, but they exist.  Or
> properties that have a mixture of integer sizes - that's much more
> common, since it can include 'reg', 'ranges' and 'interrupts'.

You could do something similar to the above or, for simplicity, just count bytes:
   PROPERTY[1*5] = /bits/ 32 0x0;
(replace 32-bit value starting at fifth byte)

> ...
> Ah, so you're suggesting putting those annotations in the property
> name itself inside the dtbo.  That's not a bad idea as encodings go.

Yes, in this case I'm thinking that the dtbo needs to be handled by libfdt
(or a similar tool) before being passed to the kernel.  Though the kernel
could conceivably support something like this directly.

> > For older utilities, or ones without the
> > handling for this extra information, the utility generates an error
> > or just adds a strangely named property that gets ignored (when I
> > gave this a quick try by hex editing a DTBO, the kernel happily gave
> > me a "gpio-line-names[s*148]").
> 
> Right, but then you silently get output that wasn't what you expected
> which can be nasty to debug.

Well, libfdt does the right thing and generates an error.  I would imagine
that the version flag (or something similar) could be used to anger the
kernel rather than creating strangely named properties.

> ...
> > Yeah, I understand why they did it the way they did - but it
> > definitely makes this particular task harder.  Some of our systems
> > are incredibly storage-constrained,
>
> Well, that raises another issue: this would add a substantial amount
> of code to libfdt, which is supposed to be runnable in very
> constrained environments.
> ...

I did a quick prototype of the string case in dtmerge, for convenience, and it doesn't seem too bad to me:
===
        dtoverlay_debug("  +prop(%s)", prop_name);
        prop_name_len = strlen(prop_name);
        array = memchr(prop_name, '[', prop_name_len);
        array_end = memchr(prop_name, ']', prop_name_len);
        if (array && array_end && array_end - array >= 4 && array[2] == '*')
        {
            if (array[1] != 's')
            {
                dtoverlay_error("  unhandled overlay array type: %s", prop_name);
                err = -FDT_ERR_BADSTRUCTURE;
                break;
            }
            prop_name_local = malloc(prop_name_len);
            strncpy(prop_name_local, prop_name, (array - prop_name));
            prop_name = prop_name_local;
        }
        else if (array && array_end)
        {
            dtoverlay_error("  invalid overlay array: %s", prop_name);
            err = -FDT_ERR_BADSTRUCTURE;
            break;
        }

        if (array && array_end &&
            ((target_prop = fdt_get_property_w(base_dtb->fdt, target_off, prop_name, &target_len)) != NULL) &&
            (target_len > 0))
        {
            int before_len, after_len;
            char *old_val, *new_val;
            int array_element;
            const char *p;

            array_element = strtoul(&array[3], NULL, 10);
            old_val = target_prop->data;
            new_val = malloc(prop_len + target_len);
            p = fdt_stringlist_get(base_dtb->fdt, target_off, prop_name, array_element, NULL);
            if (!p)
            {
                err = -FDT_ERR_NOTFOUND;
                break;
            }
            before_len = p - old_val;
            p = fdt_stringlist_get(base_dtb->fdt, target_off, prop_name, array_element + 1, NULL);
            after_len = (p ? target_len - (p - old_val) : 0);
            memcpy(&new_val[0], old_val, before_len);
            memcpy(&new_val[before_len], prop_val, prop_len);
            memcpy(&new_val[before_len+prop_len], p, after_len);
            err = fdt_setprop(base_dtb->fdt, target_off, prop_name, new_val, before_len + after_len + prop_len);
            free(new_val);
        }
        else if ...

        free(prop_name_local);
===

Do you consider something like this to be too much?  I feel like adding support for the integer array cases is pretty trivial, but if you're wanting to support mixed property types then the error checking on that would get pretty complicated.

Best,
Erich E. Hoover, Ph.D.
erich.hoover@arcfield.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [EXTERNAL] Re: Array manipulation revisit
  2024-08-06 16:29       ` [EXTERNAL] " Hoover, Erich (Orion)
@ 2024-08-20 20:06         ` Hoover, Erich (Orion)
  2024-08-21  3:44         ` David Gibson
  1 sibling, 0 replies; 9+ messages in thread
From: Hoover, Erich (Orion) @ 2024-08-20 20:06 UTC (permalink / raw)
  To: David Gibson; +Cc: devicetree-compiler@vger.kernel.org

> From: Hoover, Erich (Orion) <erich.hoover@arcfield.com>
> Sent: Tuesday, August 6, 2024 10:29 AM
> ...
> I did a quick prototype of the string case in dtmerge, for convenience, and it doesn't seem too bad to me:
> ...
> ... I feel like adding support for the integer array cases is pretty trivial, but if you're wanting to support mixed property types then the error checking on that would get pretty complicated.
> ...

If I put something similar together for libfdt would that be worth the effort?  If you like the idea then I'm happy to prototype the approach and submit something for your feedback.  If so, then would you like it to support mixed property types or just strings and various integer sizes?

Best,
Erich E. Hoover, Ph.D.
erich.hoover@arcfield.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [EXTERNAL] Re: Array manipulation revisit
  2024-08-06 16:29       ` [EXTERNAL] " Hoover, Erich (Orion)
  2024-08-20 20:06         ` Hoover, Erich (Orion)
@ 2024-08-21  3:44         ` David Gibson
  1 sibling, 0 replies; 9+ messages in thread
From: David Gibson @ 2024-08-21  3:44 UTC (permalink / raw)
  To: Hoover, Erich (Orion); +Cc: devicetree-compiler@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 6665 bytes --]

On Tue, Aug 06, 2024 at 04:29:45PM +0000, Hoover, Erich (Orion) wrote:
> > From: David Gibson
> > Sent: Monday, August 5, 2024 8:19 PM
> > ...
> > > It seems like string arrays are the only complicated case,
> > 
> > Not really.  Mixed properties that have both integers and strings in
> > them get even worse.
> 
> I was not thinking of this case, but if you want to handle this case in an initial implementation I would suggest:
>    PROPERTY[1*5+s*1+4*0] = /bits/ 32 0x0;
> (skip five bytes, skip a string, replace first 32-bit array element)

Already a substantial increase in the complexity of the overlay-time
parser.

Now.. the case where the number of integers before the strings isn't
know at overlay creation time, but comes from a separate property..

My point here isn't so much that these are super useful cases of
themselves.  But just as the string array replacement seemed an easy
and obvious extension to you, there's a whole series of "easy and
obvious" additional steps beyond that lead to something of horrific
complexity.  So far I'm not seeing an obvious boundary to set to limit
the scope of this.

> > Those aren't common, but they exist.  Or
> > properties that have a mixture of integer sizes - that's much more
> > common, since it can include 'reg', 'ranges' and 'interrupts'.
> 
> You could do something similar to the above or, for simplicity, just count bytes:
>    PROPERTY[1*5] = /bits/ 32 0x0;
> (replace 32-bit value starting at fifth byte)

*If* you know the byte offset when you're creating the overlay, which
is not always the case.

> > ...
> > Ah, so you're suggesting putting those annotations in the property
> > name itself inside the dtbo.  That's not a bad idea as encodings go.
> 
> Yes, in this case I'm thinking that the dtbo needs to be handled by libfdt
> (or a similar tool) before being passed to the kernel.  Though the kernel
> could conceivably support something like this directly.
> 
> > > For older utilities, or ones without the
> > > handling for this extra information, the utility generates an error
> > > or just adds a strangely named property that gets ignored (when I
> > > gave this a quick try by hex editing a DTBO, the kernel happily gave
> > > me a "gpio-line-names[s*148]").
> > 
> > Right, but then you silently get output that wasn't what you expected
> > which can be nasty to debug.
> 
> Well, libfdt does the right thing and generates an error.

Huh.  I didn't think we validated property names during overlay
application.  What error exactly is it?

> I would imagine
> that the version flag (or something similar) could be used to anger the
> kernel rather than creating strangely named properties.

That's harder than you'd think.  The version number is a dtb
construct, and as noted dtbo is a hack on top of that.  There's (alas)
no version number covering how overlay things are encoded in the dtb,
only for the low-level dtb encoding of properties and nodes.

> > ...
> > > Yeah, I understand why they did it the way they did - but it
> > > definitely makes this particular task harder.  Some of our systems
> > > are incredibly storage-constrained,
> >
> > Well, that raises another issue: this would add a substantial amount
> > of code to libfdt, which is supposed to be runnable in very
> > constrained environments.
> > ...
> 
> I did a quick prototype of the string case in dtmerge, for convenience, and it doesn't seem too bad to me:
> ===
>         dtoverlay_debug("  +prop(%s)", prop_name);
>         prop_name_len = strlen(prop_name);
>         array = memchr(prop_name, '[', prop_name_len);
>         array_end = memchr(prop_name, ']', prop_name_len);
>         if (array && array_end && array_end - array >= 4 && array[2] == '*')
>         {
>             if (array[1] != 's')
>             {
>                 dtoverlay_error("  unhandled overlay array type: %s", prop_name);
>                 err = -FDT_ERR_BADSTRUCTURE;
>                 break;
>             }
>             prop_name_local = malloc(prop_name_len);

Note that libfdt is designed to work in very constrained environments
and is not permitted to use an allocator (that's why there are all
those _namelen() variants).

>             strncpy(prop_name_local, prop_name, (array - prop_name));
>             prop_name = prop_name_local;
>         }
>         else if (array && array_end)
>         {
>             dtoverlay_error("  invalid overlay array: %s", prop_name);
>             err = -FDT_ERR_BADSTRUCTURE;

Hrm, that makes me think of another problem.  The more complex the
syntax allowed here, the more inadequate simple error return codes
become to communicate what's wrong, which is all we have in libfdt.

>             break;
>         }
> 
>         if (array && array_end &&
>             ((target_prop = fdt_get_property_w(base_dtb->fdt, target_off, prop_name, &target_len)) != NULL) &&
>             (target_len > 0))
>         {
>             int before_len, after_len;
>             char *old_val, *new_val;
>             int array_element;
>             const char *p;
> 
>             array_element = strtoul(&array[3], NULL, 10);
>             old_val = target_prop->data;
>             new_val = malloc(prop_len + target_len);
>             p = fdt_stringlist_get(base_dtb->fdt, target_off, prop_name, array_element, NULL);
>             if (!p)
>             {
>                 err = -FDT_ERR_NOTFOUND;
>                 break;
>             }
>             before_len = p - old_val;
>             p = fdt_stringlist_get(base_dtb->fdt, target_off, prop_name, array_element + 1, NULL);
>             after_len = (p ? target_len - (p - old_val) : 0);
>             memcpy(&new_val[0], old_val, before_len);
>             memcpy(&new_val[before_len], prop_val, prop_len);
>             memcpy(&new_val[before_len+prop_len], p, after_len);
>             err = fdt_setprop(base_dtb->fdt, target_off, prop_name, new_val, before_len + after_len + prop_len);
>             free(new_val);
>         }
>         else if ...
> 
>         free(prop_name_local);
> ===
> 
> Do you consider something like this to be too much?  I feel like
> adding support for the integer array cases is pretty trivial, but if
> you're wanting to support mixed property types then the error
> checking on that would get pretty complicated.

Yes, it would :/.

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-08-21  4:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-02 15:36 Array manipulation revisit Hoover, Erich (Orion)
2024-08-03  6:37 ` David Gibson
2024-08-05 16:35   ` Hoover, Erich (Orion)
2024-08-05 22:59     ` Rob Herring
2024-08-06  2:30       ` David Gibson
2024-08-06  2:19     ` David Gibson
2024-08-06 16:29       ` [EXTERNAL] " Hoover, Erich (Orion)
2024-08-20 20:06         ` Hoover, Erich (Orion)
2024-08-21  3:44         ` David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).