* Some slightly random musings on device tree expression syntax
From: Stephen Warren @ 2012-03-08  0:40 UTC
To: david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+, jdl-CYoMK+44s/E
Cc: devicetree-discuss

I was thinking some more about how to expand the device tree syntax to
allow expressions. I wondered if we should use a concept/syntax more
inspired by template processors. Playing with jinja2 and gpp led me
towards (...) being an inline expression syntax that can calculate
integers or strings and get replaced by the string representation of the
expression, and ! at the start of a line introducing a statement
context. So, below are my somewhat wandering thoughts on the matter.
However, the idea still raises a lot of questions that'd need to be
resolved.

I note a few things:

* Using the (...) syntax to indicate which parts of the file should be
evaluated and then substituted solves the issue that David had with Jon's
proposal re: how do you know when a node name is literal text vs.
concatenated to some expression.

* Separating the device tree syntax and pre-processor/... phase allows
them to be decoupled and the pre-processor potentially optional, or even
replaced if things don't work out, or different people could use their
own thing.

* As an aside, I wonder if we couldn't transparently allow <1 2 3> or
<1, 2, 3> for cell list syntax, thus not requiring the brackets in
previously proposed <(1 + 0) (1 + 1) (4 - 1)> syntax, but rather <1 + 0,
1 + 1, 4 - 1>?

Concept
========================================

The .dts syntax that dtc reads is unchanged.

A pre-processing phase occurs on .dts files that handles all aspects of
expressions; all definitions, macro processing, expression processing, etc.
are evaluated and fully expanded to strings during the pre-processing
phase. The result of the pre-processing phase should be a source file or
stream that can be handled by the existing dtc.

Whether this pre-processing phase is implemented as:
* A separate executable, manually invoked by the user.
* A separate executable, automatically invoked by dtc itself.
* Something built into dtc itself.
... is not addressed by this proposal.

One potential issue here: if the pre-processing and regular compilation
phases are completely separate, do we need to pay attention that the
int, literal, byte-sequence literal syntax stays the same between the
two phases to reduce confusion, or not?

Pre-processing
========================================

Contexts:

  Pass-through:

    By default, the pass-through context is active.

    Data is passed from input to output without modification, except
    that data is searched for markers that begin other contexts.

  Expression:

    Introduced by: (
    Terminated by: a matching )

    The text within this context is interpreted as an expression. That
    expression is evaluated, the result formatted as a string, and that
    string written to the output stream in place of the ( ) markers and
    the expression between them.

    Expression context can begin anywhere within the source stream; no
    note is taken of the tokens that the device tree language

  Statement:

    Introduced by: !
    Notes: Or some other suitable character; # conflicts with property
      names unless we require it to be in the first column, and also
      sounds too much like regular cpp, so people might get
      confused. @ might work. This is probably bike-shedding at this
      point...
    Terminated by: End of line

Example:

Note: // comments are used below as comments in this document, not
necessarily comments in the actual proposed syntax.

// Simple constant definitions
// Syntax of RHS matches existing .dts syntax

!defint usbbase 0x6000000
!defint usbsize 0x100
!defint usbstride 0x1000
!defstr usb "usb"
!defbytes somebytes [de ad be ef]

// or perhaps implicitly set variable type based on type of the RHS?
!define usbbase 0x6000000
!define usb "usb"

or !assign or !let ...

// RHS may also use expression syntax
// and references to previously defined variables

!defint usb3base usbbase + (2 * usbstride)
!defstr catenated usb + "2"

// Simple use of some variables:

(usbbase) (usbsize) (catenated)

// which yields:
// 0x6000000 0x100 usb2

// A more complex example:

(usb)3@(usb3base) {
    reg = <(usb3base) (usbsize)>;
    name = "(usb)3";
};

// which yields:
// usb3@0x6002000 {
//     reg = <0x6002000 0x100>;
//     name = "usb3";
// };

// Question: Do ints always format as 0x%x since that's the most common,
// or do we need explicit control over the base etc.?
//
// Question: How do we know when to format strings with "" around them,
// e.g. for use as property values, and when not to, e.g. for use in
// arbitrary contexts? For example above, it'd be nice if when defining
// the name property, we could write 'name = usb3name;' and have it
// expand to 'name = "usb3";' given a str variable with value "usb3",
// yet we don't want the quotes when using variable usb in the node
// name in the example earlier.
//
// Question: What if we actually wanted the property value "(usb3)". How
// do we stop the expansion; how to escape?
//
// I suppose the solution for the latter 2 questions is that the
// expansion has to actually be sensitive to context in the underlying
// language, and include "" in property value context, but not
// elsewhere. But, what if you write:

!defstr nasty "usb@0x6000000 { name =";
(nasty) (foo);

// Additional statements could include if, for, while, ...:

!ifdef somevar
foo bar
!else
baz qux
!endif

// I think we don't need e.g.:

foo !ifdef somevar! bar !else! baz !endif! qux

// ... since I think that we can line-break in the middle of any
// property or node definition, so we could just do this instead:

foo
!ifdef somevar
bar
!else
baz
!endif
qux

// If we need to actually concatenate the strings into one, we can do
// that as an expression somehow, assign the result to a variable, and
// expand just that.

!defstr xxx "foo"
!ifdef somevar
!defstr xxx xxx + "bar"
!else
!defstr xxx xxx + "baz"
!endif
!defstr xxx xxx + "qux"
(xxx)

// Perhaps we can delimit large blocks of statements in a way that
// doesn't need a lot of !s:

!!
xxx = "foo"
if somevar:
    xxx += "bar"
else:
    xxx += "baz"
xxx += "qux"
!!
(xxx)

// Then, we can start allowing complex things like macro or function
// definitions within the !! block; a full regular language, and
// perhaps we could even borrow an existing one here.

// About functions: Perhaps cpp-style macros:

!define func(a, b, c) a + b + c

// where the RHS is an expression that can use variables in the
// parameter list
//
// Or, is the RHS/body raw text, so something more like:

!macro func(a, b, c)
foo {
    prop = (a + b + c);
}
!end

// Perhaps we need both; one with text RHS accepting escapes into
// expressions, one with an expression on the RHS.

// I wondered if !define's RHS should always be an expression, or
// instead always be raw text with the same (...) escape to expressions
// as in regular text:

// (assuming a, b, c are extant variables)

// all variables are strings?
!define foo (a) + (b) + (c)

or:

// yields an integer variable
!defint foo a + b + c
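To make the pass-through and expression contexts described above concrete, here is a minimal Python sketch. It is not anything dtc implements: the names expand() and format_value() are hypothetical, Python's eval() merely stands in for a real expression evaluator, and ints are assumed to always format as 0x%x (one of the open questions in the message). Statement context (the ! lines) is left out; the variables are set up directly instead.

```python
def format_value(value):
    """Format an expression result as text for the output stream."""
    if isinstance(value, int):
        return "0x%x" % value  # assumption: ints always print as hex
    return str(value)          # strings are emitted without quotes


def expand(text, variables):
    """Copy text through, replacing each (...) with its evaluated value."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "(":
            out.append(text[i])  # pass-through context
            i += 1
            continue
        # Expression context: scan to the matching close paren.
        depth = 1
        j = i + 1
        while j < len(text) and depth:
            if text[j] == "(":
                depth += 1
            elif text[j] == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unterminated expression at offset %d" % i)
        # eval() is only a stand-in for a real expression evaluator.
        value = eval(text[i + 1:j - 1], {"__builtins__": {}}, dict(variables))
        out.append(format_value(value))
        i = j
    return "".join(out)


# Equivalent of the !defint / !defstr statements in the example.
variables = {"usb": "usb", "usbbase": 0x6000000,
             "usbsize": 0x100, "usbstride": 0x1000}
variables["usb3base"] = variables["usbbase"] + 2 * variables["usbstride"]

source = '(usb)3@(usb3base) { reg = <(usb3base) (usbsize)>; name = "(usb)3"; };'
print(expand(source, variables))
```

With these definitions usb3base evaluates to 0x6000000 + 2 * 0x1000 = 0x6002000, so the sketch prints usb3@0x6002000 { reg = <0x6002000 0x100>; name = "usb3"; };. Note that it expands (usb) inside the quoted string as well, which is exactly the behaviour the proposal describes and which raises the quoting and escaping questions listed in the message.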
* Re: Some slightly random musings on device tree expression syntax
From: Jon Loeliger @ 2012-03-12 13:53 UTC
To: Stephen Warren; +Cc: devicetree-discuss

> I was thinking some more about how to expand the device tree syntax to
> allow expressions.

Excellent!

> I wondered if we should use a concept/syntax more
> inspired by template processors. Playing with jinja2 and gpp led me
> towards (...) being an inline expression syntax that can calculate
> integers or strings and get replaced by the string representation of the
> expression, and ! at the start of a line introducing a statement
> context. So, below are my somewhat wandering thoughts on the matter.
> However, the idea still raises a lot of questions that'd need to be
> resolved.
>
> I note a few things:
>
> * Using the (...) syntax to indicate which parts of the file should be
> evaluated and then substituted solves the issue that David had with Jon's
> proposal re: how do you know when a node name is literal text vs.
> concatenated to some expression.

So the M4 solution then.

> * As an aside, I wonder if we couldn't transparently allow <1 2 3> or
> <1, 2, 3> for cell list syntax, thus not requiring the brackets in
> previously proposed <(1 + 0) (1 + 1) (4 - 1)> syntax, but rather <1 + 0,
> 1 + 1, 4 - 1>?

That's the sort of direction I advocated earlier.

jdl
* Re: Some slightly random musings on device tree expression syntax
From: David Gibson @ 2012-03-12 23:57 UTC
To: Jon Loeliger; +Cc: devicetree-discuss

On Mon, Mar 12, 2012 at 08:53:05AM -0500, Jon Loeliger wrote:
> > I was thinking some more about how to expand the device tree syntax to
> > allow expressions.
>
> Excellent!
>
> > I wondered if we should use a concept/syntax more
> > inspired by template processors. Playing with jinja2 and gpp led me
> > towards (...) being an inline expression syntax that can calculate
> > integers or strings and get replaced by the string representation of the
> > expression, and ! at the start of a line introducing a statement
> > context. So, below are my somewhat wandering thoughts on the matter.
> > However, the idea still raises a lot of questions that'd need to be
> > resolved.
> >
> > I note a few things:
> >
> > * Using the (...) syntax to indicate which parts of the file should be
> > evaluated and then substituted solves the issue that David had with Jon's
> > proposal re: how do you know when a node name is literal text vs.
> > concatenated to some expression.
>
> So the M4 solution then.

Erm.. use of (...) to disambiguate expressions seems an independent
matter from whether we use m4 or a macro preprocessor versus
in-dtc-proper expression evaluation.

> > * As an aside, I wonder if we couldn't transparently allow <1 2 3> or
> > <1, 2, 3> for cell list syntax, thus not requiring the brackets in
> > previously proposed <(1 + 0) (1 + 1) (4 - 1)> syntax, but rather <1 + 0,
> > 1 + 1, 4 - 1>?
>
> That's the sort of direction I advocated earlier.

Hrm. I don't think this is a good idea. Having two different cell
list formats seems to me to encourage confusion for minimal benefit.
I think (...) will generally delimit expressions more readably
anyway. Especially since it would match using that syntax to
distinguish expressions in other places, like node or property names.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: Some slightly random musings on device tree expression syntax
From: David Gibson @ 2012-03-13  4:46 UTC
To: Stephen Warren; +Cc: devicetree-discuss

On Wed, Mar 07, 2012 at 05:40:37PM -0700, Stephen Warren wrote:
> I was thinking some more about how to expand the device tree syntax to
> allow expressions. I wondered if we should use a concept/syntax more
> inspired by template processors. Playing with jinja2 and gpp led me
> towards (...) being an inline expression syntax that can calculate
> integers or strings and get replaced by the string representation of the
> expression, and ! at the start of a line introducing a statement
> context. So, below are my somewhat wandering thoughts on the matter.
> However, the idea still raises a lot of questions that'd need to be
> resolved.
>
> I note a few things:
>
> * Using the (...) syntax to indicate which parts of the file should be
> evaluated and then substituted solves the issue that David had with Jon's
> proposal re: how do you know when a node name is literal text vs.
> concatenated to some expression.

Yeah, I've been thinking for quite some time that using (...) to
disambiguate expressions in the necessary places was the way to go. It
works for cell lists, for node and property names and syntactically
required parens have precedent in C if statements.

I was only thinking of requiring (...) in places where it's
otherwise ambiguous. This works fairly naturally in the grammar,
since C-like expression grammars usually bottom out at something like:

    primitive_expr := literal | identifier | '(' expr ')' ;

So instead of replacing literal with expr in the celllist grammar, for
example, we replace it with primitive_expr.
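The effect of using primitive_expr rather than expr for cell-list entries can be sketched with a small recursive-descent parser. This is only an illustration in Python, not dtc's actual grammar (which is written for a parser generator): CellParser, its method names, and the restriction to +, - and * over decimal and hex literals are all assumptions made for the example.

```python
import re

# Tokens: hex or decimal literals, identifiers, parens, and + - * operators.
TOKEN = re.compile(r"\s*(0[xX][0-9a-fA-F]+|\d+|[A-Za-z_]\w*|[()+\-*])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError("unexpected character at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class CellParser:
    def __init__(self, text, constants=None):
        self.tokens = tokenize(text)
        self.pos = 0
        self.constants = constants or {}

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of cell list")
        self.pos += 1
        return token

    def primitive_expr(self):
        # primitive_expr := literal | identifier | '(' expr ')'
        token = self.take()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token[0].isdigit():
            return int(token, 16) if token[:2].lower() == "0x" else int(token)
        if token[0].isalpha() or token[0] == "_":
            return self.constants[token]
        raise SyntaxError("token %r is not a primitive expression" % token)

    def term(self):
        value = self.primitive_expr()
        while self.peek() == "*":
            self.take()
            value *= self.primitive_expr()
        return value

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def cells(self):
        # A cell list is a sequence of primitive_expr, not of full expr.
        result = []
        while self.peek() is not None:
            result.append(self.primitive_expr())
        return result


print(CellParser("(1 + 0) (1 + 1) (4 - 1)").cells())
```

The last line prints [1, 2, 3]. An unparenthesised entry such as "1 + 0" raises SyntaxError, because a bare operator is not a primitive_expr; that is what keeps <1 2 3> unambiguous without introducing a second, comma-separated cell-list format.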

> * Separating the device tree syntax and pre-processor/... phase allows
> them to be decoupled and the pre-processor potentially optional, or even
> replaced if things don't work out, or different people could use their
> own thing.

Ok, so, I've been leaning towards a preprocessor for constant/macro
support for some time (on the basis of the ratio of flexibility to
conceptual complexity). However, I was envisaging that stage
outputting (constant) expressions that were still actually evaluated
by dtc. Still, if you can make a good case for expression evaluation
in the pre-processor...

> * As an aside, I wonder if we couldn't transparently allow <1 2 3> or
> <1, 2, 3> for cell list syntax, thus not requiring the brackets in
> previously proposed <(1 + 0) (1 + 1) (4 - 1)> syntax, but rather <1 + 0,
> 1 + 1, 4 - 1>?

As I said in another reply, I don't like this idea. It creates
potentially confusing variations of the syntax for no benefit that I
can see.

> Concept
> ========================================
>
> The .dts syntax that dtc reads is unchanged.
>
> A pre-processing phase occurs on .dts files that handles all aspects of
> expressions; all definitions, macro processing, expression processing, etc.
> are evaluated and fully expanded to strings during the pre-processing
> phase. The result of the pre-processing phase should be a source file or
> stream that can be handled by the existing dtc.
>
> Whether this pre-processing phase is implemented as:
> * A separate executable, manually invoked by the user.
> * A separate executable, automatically invoked by dtc itself.
> * Something built into dtc itself.
> ... is not addressed by this proposal.
>
> One potential issue here: if the pre-processing and regular compilation
> phases are completely separate, do we need to pay attention that the
> int, literal, byte-sequence literal syntax stays the same between the
> two phases to reduce confusion, or not?

I'm not sure quite what you're getting at here.
>
> Pre-processing
> ========================================
>
> Contexts:
>
>   Pass-through:
>
>     By default, the pass-through context is active.
>
>     Data is passed from input to output without modification, except
>     that data is searched for markers that begin other contexts.
>
>   Expression:
>
>     Introduced by: (
>     Terminated by: a matching )
>
>     The text within this context is interpreted as an expression. That
>     expression is evaluated, the result formatted as a string, and that
>     string written to the output stream in place of the ( ) markers and
>     the expression between them.
>
>     Expression context can begin anywhere within the source stream; no
>     note is taken of the tokens that the device tree language

Hrm. I'm pretty dubious about doing the expression evaluation (as
opposed to macro/constant expansion) within the preprocessor, then
resubstituting as a string.

It would work ok for integer expressions, but for bytestring
expressions, it seems likely we'd have to duplicate the
lexical/grammar constructs for [...], <...> and basic literals between
preproc and dtc, which seems a bit horrible.

In addition this approach means that an expression can never express a
value which a literal couldn't. No problem in most cases, but one
thing I had in mind is that an expression syntax could be used to
specify a node or property name with illegal characters in it (mostly
relevant for ensuring that doing -I dtb -O dts then -I dts -O dtb will
always end up exactly where you started, even when the original dtb is
corrupted or otherwise contains things it shouldn't).

>   Statement:
>
>     Introduced by: !
>     Notes: Or some other suitable character; # conflicts with property
>       names unless we require it to be in the first column, and also
>       sounds too much like regular cpp, so people might get
>       confused. @ might work. This is probably bike-shedding at this
>       point...
>     Terminated by: End of line

These three states aren't quite sufficient. At the very least you
need a string state, so that expressions are not expanded within " ".
And we probably shouldn't be expanding them within comments, either.

> Example:
>
> Note: // comments are used below as comments in this document, not
> necessarily comments in the actual proposed syntax.
>
> // Simple constant definitions
> // Syntax of RHS matches existing .dts syntax
>
> !defint usbbase 0x6000000
> !defint usbsize 0x100
> !defint usbstride 0x1000
> !defstr usb "usb"
> !defbytes somebytes [de ad be ef]
>
> // or perhaps implicitly set variable type based on type of the RHS?
> !define usbbase 0x6000000
> !define usb "usb"

Hrm. If using defines is based on textual substitution, then type
should be irrelevant. If they're not based on textual substitution,
then the "preprocessor" is doing something rather more involved than
something with that name normally would.

> or !assign or !let ...
>
> // RHS may also use expression syntax
> // and references to previously defined variables
>
> !defint usb3base usbbase + (2 * usbstride)
> !defstr catenated usb + "2"
>
> // Simple use of some variables:
>
> (usbbase) (usbsize) (catenated)
>
> // which yields:
> // 0x6000000 0x100 usb2
>
> // A more complex example:
>
> (usb)3@(usb3base) {
>     reg = <(usb3base) (usbsize)>;
>     name = "(usb)3";
> };

Oh. You *intended* for expression substitution within strings. Nack,
nack nackity nack. That violates least surprise seven ways to
Sunday. If the user wants something like this they can do:

    name = (usb + "3");

> // which yields:
> // usb3@0x6002000 {
> //     reg = <0x6002000 0x100>;
> //     name = "usb3";
> // };
>
> // Question: Do ints always format as 0x%x since that's the most common,
> // or do we need explicit control over the base etc.?

The user certainly shouldn't have to care what base two apparently
internal parts of dtc use to talk to each other.

> // Question: How do we know when to format strings with "" around them,
> // e.g. for use as property values, and when not to, e.g. for use in
> // arbitrary contexts? For example above, it'd be nice if when defining
> // the name property, we could write 'name = usb3name;' and have it
> // expand to 'name = "usb3";' given a str variable with value "usb3",
> // yet we don't want the quotes when using variable usb in the node
> // name in the example earlier.

Yeah. This is another reason I don't think splitting the expression
evaluation from the surrounding grammatical context is a good idea.

> // Question: What if we actually wanted the property value "(usb3)". How
> // do we stop the expansion; how to escape?
> //
> // I suppose the solution for the latter 2 questions is that the
> // expansion has to actually be sensitive to context in the underlying
> // language, and include "" in property value context, but not
> // elsewhere. But, what if you write:

Well, yeah, which would mean duplicating large amounts of the grammar
between the expression evaluator and the rest of dtc.

> !defstr nasty "usb@0x6000000 { name =";
> (nasty) (foo);
>
> // Additional statements could include if, for, while, ...:
>
> !ifdef somevar
> foo bar
> !else
> baz qux
> !endif
>
> // I think we don't need e.g.:
>
> foo !ifdef somevar! bar !else! baz !endif! qux
>
> // ... since I think that we can line-break in the middle of any
> // property or node definition, so we could just do this instead:
>
> foo
> !ifdef somevar
> bar
> !else
> baz
> !endif
> qux
>
> // If we need to actually concatenate the strings into one, we can do
> // that as an expression somehow, assign the result to a variable, and
> // expand just that.
>
> !defstr xxx "foo"
> !ifdef somevar
> !defstr xxx xxx + "bar"
> !else
> !defstr xxx xxx + "baz"
> !endif
> !defstr xxx xxx + "qux"
> (xxx)
>
> // Perhaps we can delimit large blocks of statements in a way that
> // doesn't need a lot of !s:
>
> !!
> xxx = "foo"
> if somevar:
>     xxx += "bar"
> else:
>     xxx += "baz"
> xxx += "qux"
> !!
> (xxx)
>
> // Then, we can start allowing complex things like macro or function
> // definitions within the !! block; a full regular language, and
> // perhaps we could even borrow an existing one here.
>
> // About functions: Perhaps cpp-style macros:
>
> !define func(a, b, c) a + b + c
>
> // where the RHS is an expression that can use variables in the
> // parameter list
> //
> // Or, is the RHS/body raw text, so something more like:
>
> !macro func(a, b, c)
> foo {
>     prop = (a + b + c);
> }
> !end
>
> // Perhaps we need both; one with text RHS accepting escapes into
> // expressions, one with an expression on the RHS.
>
> // I wondered if !define's RHS should always be an expression, or
> // instead always be raw text with the same (...) escape to expressions
> // as in regular text:
>
> // (assuming a, b, c are extant variables)
>
> // all variables are strings?
> !define foo (a) + (b) + (c)
>
> or:
>
> // yields an integer variable
> !defint foo a + b + c

Ugh. Well, I think you've pretty much proved the case that attempting
to put all the expression evaluation into the preprocessor is a really
bad idea. It requires the preproc to be at least somewhat type aware
which (a) is likely to lead to grammar duplication and (b) is
absolutely not what someone familiar with cpp will expect.

Note that evaluating *constant* expressions in dtc works very
naturally into the existing structure and grammar. I certainly have
no objection to that, and I don't know of anyone that did. It's
storing and evaluating functions or macros in dtc proper that I'm
dubious about because it requires storing partial parse trees or some
other intermediate representation in a way we have never needed to
before. That means a whole bunch of extra code and data structures.

Now, implementing a preprocessor with (initially) similar features to
cpp, but using ! instead of #, might have legs. In fact even using
#-in-column-0 might be ok, but we'd want our own cpp implementation
because there's no portable way of ensuring that a system cpp will
only recognize # in column 0 and not elsewhere.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
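The objection raised in this message, that the three proposed contexts are not sufficient and that string and comment states are needed, can be illustrated with a small scanner. This is a hedged Python sketch: expression_spans() is a hypothetical helper, and it assumes C-style // and /* */ comments and backslash escapes inside strings. It only locates the (...) spans a pre-processor would be allowed to expand; it evaluates nothing.

```python
def expression_spans(text):
    """Return (start, end) offsets of (...) spans outside strings and comments."""
    spans = []
    i, n = 0, len(text)
    while i < n:
        if text.startswith("//", i):            # line comment state
            end = text.find("\n", i)
            i = n if end == -1 else end + 1
        elif text.startswith("/*", i):          # block comment state
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        elif text[i] == '"':                    # string state
            i += 1
            while i < n and text[i] != '"':
                i += 2 if text[i] == "\\" else 1
            i += 1                              # skip the closing quote
        elif text[i] == "(":                    # expression state
            # Simplification: a ')' inside a string within the expression
            # would end the span early; a real scanner would track that too.
            depth, j = 1, i + 1
            while j < n and depth:
                depth += {"(": 1, ")": -1}.get(text[j], 0)
                j += 1
            if depth:
                raise ValueError("unterminated expression at offset %d" % i)
            spans.append((i, j))
            i = j
        else:                                   # pass-through state
            i += 1
    return spans


source = 'name = "(usb)3"; // (ignored)\nreg = <(usb3base) (usbsize)>;'
print([source[a:b] for a, b in expression_spans(source)])
```

For the sample input this prints ['(usb3base)', '(usbsize)']: the (usb) inside the quoted string and the (ignored) inside the comment are left alone. Even this small scanner already has to know the string and comment syntax of the .dts language, which is a small instance of the duplication between pre-processor and dtc that the message warns about.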
* Re: Some slightly random musings on device tree expression syntax
From: Stephen Warren @ 2012-03-13 19:56 UTC
To: David Gibson; +Cc: devicetree-discuss

On 03/12/2012 10:46 PM, David Gibson wrote:
> On Wed, Mar 07, 2012 at 05:40:37PM -0700, Stephen Warren wrote:
>> I was thinking some more about how to expand the device tree syntax to
>> allow expressions. I wondered if we should use a concept/syntax more
>> inspired by template processors.
...
>> Whether this pre-processing phase is implemented as:
>> * A separate executable, manually invoked by the user.
>> * A separate executable, automatically invoked by dtc itself.
>> * Something built into dtc itself.
>> ... is not addressed by this proposal.
>>
>> One potential issue here: if the pre-processing and regular compilation
>> phases are completely separate, do we need to pay attention that the
>> int, literal, byte-sequence literal syntax stays the same between the
>> two phases to reduce confusion, or not?
>
> I'm not sure quite what you're getting at here.

Well, it's the point you make right below. Namely that if expression
evaluation happens during pre-processing (either only there, or both
there and during the separate final "compilation" phase), that the
pre-processor must be able to parse and manipulate literals of all
types, so the expressions it calculates can use values of those types.

...
> Hrm. I'm pretty dubious about doing the expression evaluation (as
> opposed to macro/constant expansion) within the preprocessor, then
> resubstituting as a string.
>
> It would work ok for integer expressions, but for bytestring
> expressions, it seems likely we'd have to duplicate the
> lexical/grammar constructs for [...], <...> and basic literals between
> preproc and dtc, which seems a bit horrible.

Don't we have to allow the pre-processor to parse and manipulate
constants of all types (both scalars and perhaps even complete nodes)?
If we don't, then how would you do something like:

    var = [00 11 aa 55]
    for byte in var:
        do_something_with(byte)

or:

    var = "Some long string"
    for word in var.split():
        do_something_with(word)

> In addition this approach means that an expression can never express a
> value which a literal couldn't. No problem in most cases, but one
> thing I had in mind is that an expression syntax could be used to
> specify a node or property name with illegal characters in it (mostly
> relevant for ensuring that doing -I dtb -O dts then -I dts -O dtb will
> always end up exactly where you started, even when the original dtb is
> corrupted or otherwise contains things it shouldn't).

Well, one might imagine:

    s = "Some text" + chr(128)

That's an expression that expresses something that I think can't
currently be a literal string.

...
>> !defint usbbase 0x6000000
>> !defstr usb "usb"
>> !defbytes somebytes [de ad be ef]
>>
>> // or perhaps implicitly set variable type based on type of the RHS?
>> !define usbbase 0x6000000
>> !define usb "usb"
>
> Hrm. If using defines is based on textual substitution, then type
> should be irrelevant. If they're not based on textual substitution,
> then the "preprocessor" is doing something rather more involved than
> something with that name normally would.

True. I was more leaning to describing this as a template processor than
a pre-processor. Related, my thoughts started out simpler, but became
more complex and raised a lot of open questions when thinking through
some of the details, so became a lot less clear!

>> // A more complex example:
>>
>> (usb)3@(usb3base) {
>>     reg = <(usb3base) (usbsize)>;
>>     name = "(usb)3";
>> };
>
> Oh. You *intended* for expression substitution within strings. Nack,
> nack nackity nack. That violates least surprise seven ways to
> Sunday. If the user wants something like this they can do:
> name = (usb + "3");

That works for the name property, but what about the node's name:

    (usb)3@(usb3base) {

Even if we required that the whole thing be calculated elsewhere and
placed into a variable, how do we know whether:

    foo {

is meant to expand variable foo or be literal "foo"? That seemed to be
one of your main objections to Jon's implementation. I proposed solving
that by explicitly marking the source to indicate where expansion was
desired:

    (foo) {

or not:

    foo {

So, () act as "start and end of expression".

Given that, why not allow complete expressions with () rather than just
a single variable or macro call?

This is pretty much the core point of why I was referring to a
templating engine rather than a pre-processor. Of course, templating
engines often use e.g. <%= %> instead of ( ) or a wide variety of other
syntaxes.

...
> Ugh. Well, I think you've pretty much proved the case that attempting
> to put all the expression evaluation into the preprocessor is a really
> bad idea. It requires the preproc to be at least somewhat type aware
> which (a) is likely to lead to grammar duplication and (b) is
> absolutely not what someone familiar with cpp will expect.

Well, I don't necessarily agree that people would be by default
expecting the syntax/... must match cpp specifically; there are many
many other pre-processors, macro-processors, template languages etc. out
there.
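What "parse and manipulate constants of all types" would mean for the byte-sequence example in this message can be sketched in a few lines of Python. The helpers parse_bytestring() and format_bytestring() are hypothetical, and the sketch assumes a [..] literal contains only hex digit pairs separated by optional spaces.

```python
def parse_bytestring(literal):
    """Parse a [de ad be ef] style literal into a bytes object."""
    literal = literal.strip()
    if not (literal.startswith("[") and literal.endswith("]")):
        raise ValueError("expected a [ ... ] byte-sequence literal")
    # bytes.fromhex() accepts hex digit pairs with optional spaces between.
    return bytes.fromhex(literal[1:-1])


def format_bytestring(data):
    """Format bytes back into [..] literal text for the output stream."""
    return "[" + " ".join("%02x" % b for b in data) + "]"


var = parse_bytestring("[00 11 aa 55]")
for byte in var:                 # the "for byte in var" loop from the message
    print("0x%02x" % byte)
print(format_bytestring(bytes(reversed(var))))
```

The loop prints 0x00, 0x11, 0xaa and 0x55 on separate lines, and the final line prints [55 aa 11 00]. The point of the sketch is that both directions, parsing the literal and re-emitting it as text, have to exist in the pre-processor as well as in dtc, which is the duplication being debated in this exchange.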
* Re: Some slightly random musings on device tree expression syntax [not found] ` <4F5FA653.90802-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> @ 2012-03-14 14:42 ` David Gibson 0 siblings, 0 replies; 6+ messages in thread From: David Gibson @ 2012-03-14 14:42 UTC (permalink / raw) To: Stephen Warren; +Cc: devicetree-discuss On Tue, Mar 13, 2012 at 01:56:03PM -0600, Stephen Warren wrote: > On 03/12/2012 10:46 PM, David Gibson wrote: > > On Wed, Mar 07, 2012 at 05:40:37PM -0700, Stephen Warren wrote: > >> I was thinking some more about how to expand the device tree syntax to > >> allow expressions. I wondered if we should use a concept/syntax more > >> inspired by template processors. > ... > >> Whether this pre-processing phase is implemented as: > >> * A separate executable, manually invoked by the user. > >> * A separate executable, automatically invoked by dtc itself. > >> * Something built into dtc itself. > >> ... is not addressed by this proposal. > >> > >> One potential issue here: if the pre-processing and regular compilation > >> phases are completely separate, do we need to pay attention that the > >> int, literal, byte-sequence literal syntax stays the same between the > >> two phases to reduce confusion, or not? > > > > I'm not sure quite what you're getting at here. > > Well, it's the point you make right below. Namely that if expression > evaluation happens during pre-processing (either only there, or both > there and during the separate final "compilation" phase), that the > pre-processor must be able to parse and manipulate literals of all > types, so the expressions it calculates can use values of those > types. Um.. if you insist on doing the sort of very fancy stuff in the pre-processor that you're talking about. A lot of that becomes unnecessary with sufficient expression support in dtc. Especially remembering that if you have really fancy needs, you can always generate dts output from a real programming language. Or rather, put it this way. 
My preferred option is still a (simple!) pre-processor with reasonably rich constant expression support in dtc proper. But I prefer Jon's full-language-in-dtc approach to this full-language-in-preprocessor with very simple dtc hybrid approach - it's really the worst of both worlds. > ... > > Hrm. I'm pretty dubious about doing the expression evaluation (as > > opposed to macro/constant expansion) within the preprocessor, then > > resubstituting as a string. > > > > It would work ok for integer expressions, but for bytestring > > expressions, it seems likely we'd have to duplicate the > > lexical/grammar constructs for [...], <...> and basic literals between > > preproc and dtc, which seems a bit horrible. > > Don't we have to allow the pre-processor to parse and manipulate > constants of all types (both scalars and perhaps even complete nodes)? > If we don't, then how would you do something like: > > var = [00 11 aa 55] > for byte in var: > do_something_with(byte) > > or: > > var = "Some long string" > for word in var.split(): > do_something_with(word) Um, yeah, if you want Python, generate your dts from Python, we're not going to recreate Python within dtc, let alone within a dtc preprocessor. A pre-processor should do at most, textual macro substitution (#define), with maybe a (still textual / call-by-name) foreach construct (though even that may not be necessary if we have iteration functions). Anything that involves type awareness and it's a full language, not a pre-processor which means we should either (1) generate the dts from an existing language or (2) write the language into dtc proper so its syntax is properly merged with dtc. > > In addition this approach means that an expression can never express a > > value which a literal couldn't. 
> > No problem in most cases, but one
> > thing I had in mind is that an expression syntax could be used to
> > specify a node or property name with illegal characters in it (mostly
> > relevant for ensuring that doing -I dtb -O dts then -I dts -O dtb will
> > always end up exactly where you started, even when the original dtb is
> > corrupted or otherwise contains things it shouldn't).
>
> Well, one might imagine:
>
> s = "Some text" + chr(128)
>
> That's an expression that expresses something that I think can't
> currently be a literal string.

So the expression preprocessor can generate such a thing, but in your
scheme it has no way to output it back to dtc except as a literal.
Oops.

Well, except the problem actually only arises for node and property
names; for quoted strings in property values, that can be expressed
as a literal - "Some text\x80".

> ...
> >> !defint usbbase 0x6000000
> >> !defstr usb "usb"
> >> !defbytes somebytes [de ad be ef]
> >>
> >> // or perhaps implicitly set variable type based on type of the RHS?
> >> !define usbbase 0x6000000
> >> !define usb "usb"
> >
> > Hrm. If using defines is based on textual substitution, then type
> > should be irrelevant. If they're not based on textual substitution,
> > then the "preprocessor" is doing something rather more involved than
> > something with that name normally would.
>
> True. I was more leaning to describing this as a template processor than
> a pre-processor. Related, my thoughts started out simpler, but became
> more complex and raised a lot of open questions when thinking through
> some of the details, so became a lot less clear!

Trickier than it seems, isn't it?  There's a reason this has been
discussed on and off for several years now.

> >> // A more complex example:
> >>
> >> (usb)3@(usb3base) {
> >>     reg = <(usb3base) (usbsize)>;
> >>     name = "(usb)3";
> >> };
> >
> > Oh. You *intended* for expression substitution within strings. Nack,
> > nack nackity nack.
> > That violates least surprise seven ways to
> > Sunday. If the user wants something like this they can do:
> > name = (usb + "3");
>
> That works for the name property, but what about the node's name:
>
> (usb)3@(usb3base) {
>
> Even if we required that the whole thing be calculated elsewhere and
> placed into a variable, how do we know whether:
>
> foo {
>
> is meant to expand variable foo or be literal "foo"? That seemed to be
> one of your main objections to Jon's implementation. I proposed solving
> that by explicitly marking the source to indicate where expansion was
> desired:
>
> (foo) {
>
> or not:
>
> foo {
>
> So, () act as "start and end of expression".

Yes.  So, my thinking was that for node or property names, when one
is given as an expression, it's a normal string expression, with
quoted literals and the rest, rather than using bare strings - bare
strings are seen as just a shortcut for the simple case.  This is,
again, incompatible with your idea of a separate expression
pre-processor, because it requires awareness of the context.

So:
    foo {
and
    ("foo") {
would be equivalent.  And for the constructed example above you'd use:
    (usb + "3@" + usb3base) {

> Given that, why not allow complete expressions with () rather than just
> a single variable or macro call?

Never suggested we shouldn't.  But we absolutely shouldn't be using
bare strings in expressions the way we do in non-expression node or
property names.

> This is pretty much the core point of why I was referring to a
> templating engine rather than a pre-processor. Of course, templating
> engines often use e.g. <%= %> instead of ( ) or a wide variety of other
> syntaxes.
>
> ...
> > Ugh. Well, I think you've pretty much proved the case that attempting
> > to put all the expression evaluation into the preprocessor is a really
> > bad idea.
> > It requires the preproc to be at least somewhat type aware
> > which (a) is likely to lead to grammar duplication and (b) is
> > absolutely not what someone familiar with cpp will expect.
>
> Well, I don't necessarily agree that people would be by default
> expecting the syntax/... must match cpp specifically; there are many
> many other pre-processors, macro-processors, template languages etc. out
> there.

Not perfectly, no.  But the target audience of dtc is largely C
programmers, the existing core syntax is C-like, and the
least-surprise principle should be applied in that context.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
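
Illustration (not part of the original message): the "generate your
dts from a real programming language" option mentioned in the reply
can be sketched roughly as below.  This is only a minimal sketch under
stated assumptions.  The constant values are borrowed from the !defint
examples in the original proposal; the function names usb_node and
usb_nodes, the base + index * stride address layout, and the choice to
emit only a reg property are all hypothetical and not anything dtc or
the thread prescribes.

```python
# Hypothetical sketch: emit a .dts fragment from Python rather than
# teaching a pre-processor about types. Values mirror the !defint
# examples in the proposal; the address layout is an assumption.

USB_BASE = 0x6000000    # usbbase in the proposal
USB_SIZE = 0x100        # usbsize in the proposal
USB_STRIDE = 0x1000     # usbstride in the proposal


def usb_node(index: int) -> str:
    """Return dts source text for one USB controller node."""
    # Assumed layout: controller N sits at base + N * stride.
    base = USB_BASE + index * USB_STRIDE
    name = f"usb{index}"
    # The unit address after '@' is written in hex without a 0x prefix;
    # cell values inside <...> keep the 0x prefix.
    return (
        f"{name}@{base:x} {{\n"
        f"\treg = <{base:#x} {USB_SIZE:#x}>;\n"
        f"}};\n"
    )


def usb_nodes(count: int) -> str:
    """Return dts source text for `count` consecutive USB nodes."""
    return "\n".join(usb_node(i) for i in range(count))


if __name__ == "__main__":
    # The printed text is ordinary dts that an unmodified dtc can read.
    print(usb_nodes(4), end="")
```

With this approach all arithmetic and string building happens in the
host language, and dtc only ever sees plain literals, so no grammar
needs to be duplicated between a pre-processor and dtc.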