From mboxrd@z Thu Jan 1 00:00:00 1970 From: Benoit Cousson Subject: Re: [RFC 00/15] Device Tree schemas and validation Date: Thu, 03 Oct 2013 15:17:57 +0200 Message-ID: <524D6E85.4000002@baylibre.com> References: <1380041541-17529-1-git-send-email-bcousson@baylibre.com> <524B4B20.4020002@wwwdotorg.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <524B4B20.4020002@wwwdotorg.org> Sender: linux-omap-owner@vger.kernel.org To: Stephen Warren Cc: olof@lixom.net, devicetree@vger.kernel.org, tomasz.figa@gmail.com, grant.likely@secretlab.ca, rob.herring@calxeda.com, khilman@linaro.org, linux-omap@vger.kernel.org, linux-arm-kernel@lists.infradead.org, fparent@baylibre.com List-Id: devicetree@vger.kernel.org Hi Stephen, On 02/10/2013 00:22, Stephen Warren wrote: > On 09/24/2013 10:52 AM, Benoit Cousson wrote: >> Hi All, >> >> Following the discussion that happened during LCE-2013 and the email >> thread started by Tomasz few months ago [1], here is a first attempt >> to introduce: >> - a schema language to define the bindings accurately >> - DTS validation during device tree compilation in DTC itself > > Sorry, this is probably going to sound a bit negative. Hopefully you > find it constructive though. Well, I hope so too, let's see at the end of the email :-) >> The syntax for a schema is the same as the one for dts. This choice has >> been made to simplify its development, to maximize the code reuse and >> finally because the format is human-readable. > > I'm not convinced that's a good decision. Me neither :-), but I gave the current rational... > DT is a language for representing data. > > The validation checks described by schemas are rules, or code, and not > static data. I will not be that strict with what DTS is supposed to do. In that aspect DTS is just a way to represent information in a structured hierarchical way. It is for my point of view no different than XML. I know that everybody hate XML, including me, but that shows at least what is doable with such language. The fact that we like it or not is a different topic. > So, while I'm sure it's possible to shoe-horn at least some reasonable > subset of DT validation into DT syntax itself, I feel it's unlikely to > yield something that's scalable enough. I don't think we have any limit with such representation. My concern will be more the readability. To be honest, a language choice is by nature completely subjective, and nobody will have the same taste. So we can spend weeks arguing about that :-) > For example, it's easy to specify that a property must be 2 cells long. > What if it could be any multiple of two? That's a lot of numbers to > explicitly enumerate as data. Sure, you can then invent syntax to > represent that specific rule (parameterized by 2), but what about the > next similar-but-different rule? The only approach I can think of to > that is to allow the schema to contain arbitrary expressions, which > would likely need to morph into arbitary statements not just > expressions. Once you're there, I think the schema would be better > represented as a programming language rather than as a data structure > that could have code hooked into it. Sure, but how many complex cases like that do we have? I guess, we can handle all the use-cases required by Rob with the current syntax. Let's assume we cover 99% of the use-cases with such language, do we really want to have a super complex language just for the corner cases? Potentially, writing a C extension to DTC is still possible for that specific case. Not ideal, I do agree, but we have to be pragmatic as well. We really need to understand how scalable we have to be before deciding that the current representation is not good enough. >> How to: >> * Associate a schema to one or several nodes >> >> As said earlier a schema can be used to validate one or several nodes >> from a dts. To do this the "compatible" properties from the nodes which >> should be validated must be present in the schema. >> >> timer1: timer@4a318000 { >> compatible = "ti,omap3430-timer"; > ... >> To write a schema which will validate OMAP Timers like the one above, >> one may write the following schema: >> >> /dts-v1/; >> / { >> compatible = "ti,omap[0-9]+-timer"; > > What about DT nodes that don't have a compatible value? We certainly > have some of those already like /memory and /chosen. We should be able > to validate their schema too. This probably doesn't invalidate being > able to look things up by compatible value though; it just means we need > some additional mechanisms too. Yes, that's a good point and easy to add as well. >> * Define constraints on properties >> >> To define constraints on a property one has to create a node in a schema >> which has as name the name of the property that one want to validate. >> >> To specify constraints on the property "ti,hwmods" of OMAP Timers one >> can write this schema: >> >> /dts-v1/; >> / { >> compatible = "ti,omap[0-9]+-timer"; >> ti,hwmods { >> ... >> }; > > compatible and ti,hwmods are both properties in the DT file. However, in > the schema above, one appears as a property, and one as a node. I don't > like that inconsistency. It'd be better if compatible was a node too. That's already possible, you can check the timer.schema. The point is to simplify the representation for simple case and use a attribute instead of a node. But that will make 2 different representation for the same case, which might not be that good. >> If one want to use a regular as property name one can write this schema: >> >> /dts-v1/; >> / { >> compatible = "abc"; >> def { >> name = "def[0-9]"; > > Isn't it valid to have a property named "name" within the node itself? > How do you differentiate between specifying the node name and the name > property? You don't have to. In this case the attributes inside the node are strictly the schema language keywords. That being said, it might happen for some other casea, so maybe a prefix like "schema,XXX" should be use to create a proper namespace. > What if the node name needs more validation than just a regex. For > example, suppose we want to validate the > unit-name-must-match-reg-address rule. We need to write some complex > expression using data extracted from reg to calculate the unit address. > Equally, the node name perhaps has to exist in some global list of > acceptable node names. It would be extremely tricky if not impossible to > do that with a regex. Sure, but again, do we have such cases already? How far do we want to go in term of complexity for corner cases. >> ... >> }; >> }; >> >> Above one can see that the "name" property override the node name. > > Override implies that dtc would change the node name during compilation. > I think s/override/validate/ or s/override/overrides the validation > rules for/? OK >> * Require the presence of a property inside a node or inside one of its >> parents > ... >> /dts-v1/; >> / { >> compatible = "ti,twl[0-9]+-rtc"; >> interrupt-controller { >> is-required; >> can-be-inherited; > > interrupt-controller isn't a good example here, since it isn't a > property that would typically be inherited. Why not use interrupt-parent > instead? Yeah, that's a mistake, it should have been interrupt-parent. It was done for that attribute mainly. >> One can check if 'node' has the following subnode 'subnode1', 'subnode2', >> and 'abc' with the schema below: >> >> /dts-v1/; >> / { >> compatible = "comp"; >> children = "abc", "subnode[0-9]"; >> }; > > How is the schema for each sub-node specified? sub-node are handled like any other regular node. If needed you can set constraints on parent node using the parents keyword. > What if some nodes are optional and some required? The conditions where > a sub-node is required might be complex, and I think we'd always want to > be able to represent them in whatever schema language we chose. > > The most obvious way would be to make each sub-node's schema appear as a > sub-node within the main node's schema, but then how do you tell if a > schema node describes a property or a node? I'm not sure about that. That was my first impression as well when we started, but in fact I don't think this is really needed. By doing that, you end up creating a schema that looks like your final dts. So this become some kind of template more than a schema. > Note that the following DT file is currently accepted by dtc even if it > may not be the best choice of property and node names: > > ========== > /dts-v1/; > > / { > foo = <1>; > foo {}; > }; > ========== > >> * Constraints on array size >> >> One can specify the following constraints on array size: >> - length: specify the exact length that an array must have. >> - min-length: specify the minimum number of elements an array must have. >> - max-length: specify the maximum number of elements an array must have. > > This seems rather inflexible; it'll cover a lot of the simple cases, but > hit a wall pretty soon. For example, how would it validate a property > that is supposed to include 3 GPIO specifiers, where the GPIO specifiers > are going to have DT-specific lengths, since the length of each > specifier is defined by the node that the phandles reference? sure, but that kind of check can be added. > Overall, I believe perhaps the single most important aspect of any DT > schema is schema inheritance or instancing, and this proposal doesn't > appear to address that issue at all. It does not handle inheritance completely, but that's not necessarily needed for the cases you describe below. > Inheritance of schemas: > > For example, any node that is addressed must contain a reg property. The > constraints on that property are identical in all bindings; it must > consist of #address-cells + #size-cells integer values (cells). We don't > want to have to cut/paste that rule into every single binding > definition. Rather, we should simply say something like "this binding > uses the reg property", and the schema validation tool will look up the > definition of "reg property", and hence know how to validate it. That's almost doable with the current mechanism and part of the plan. You can already add a generic rule that will apply to every nodes thanks to a wildcard RE. Then later you can add a more specific rule that will apply to few nodes only. But I realized, we did not even used that in the example we did :-( > Similarly, any binding that describes a GPIO controller will have some > similar requirements; the gpio-controller and #gpio-cells properties > must be present. The schema should simply say "I'm a GPIO controller", > and the schema tool should add some extra requirements to nodes of that > type. Yes, agreed. Should be doable using previous mechanism. But will need some improvement. > Instancing of schemas: > > Any binding that uses GPIOs should be able to say that a particular > property (e.g. "enable-gpios") is-a GPIO-specifier (with parameters > "enable" for the property name, min/max/expression length, etc.), and > then the schema validation tool would know to apply rules for a > specifier list to that property (and be able to check the property name). > Thanks for your comments, that are indeed really good and constructive. Benoit