From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wolfgang Grandegger <wg@grandegger.com>
Date: Thu, 26 Apr 2007 09:39:32 +0200
Subject: [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
In-Reply-To: <462FC2BE.9070309@smiths-aerospace.com>
References: <462F1831.7070902@grandegger.com>
	<462F6FDC.8020804@smiths-aerospace.com>
	<462FADF0.9020008@grandegger.com>
	<462FC2BE.9070309@smiths-aerospace.com>
Message-ID: <46305734.70909@grandegger.com>
List-Id: <u-boot.lists.denx.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

Jerry Van Baren wrote:
> Wolfgang Grandegger wrote:
>> Jerry Van Baren wrote:
>>> Wolfgang Grandegger wrote:
>>>> Hello,
>>>>
>>>> attached you can find a patch implementing fdt_find_compatible_node():
> 
> [snip]
> 
>>> For the above version of fdt_find_compatible_node(), as well as the 
>>> one that fills the cache table (snipped), I'm thinking it would be 
>>> better to use the function I added:
>>>
>>> uint32_t fdt_next_tag(const void *fdt, int offset, int *nextoffset, 
>>> char **namep)
>>>
>>> I added this to step through the nodes and properties for dumping the 
>>> tree.  Rather than having Yet Another (slightly modified) Copy of the 
>>> loop & switch, you should be able to use fdt_next_tag() to step 
>>> through the nodes and properties, doing the
>>>   if (streq(fdt_string(fdt, namestroff), "device_type"))
>>> on the **namep parameter on every call to find the device_type 
>>> properties.  Does this make sense?
>>
>> Yes, my concern was the overhead due to looking up the location of 
>> each node name and property. But for fast scanning, the cached version 
>> is much better anyhow, I think.
>>
>> What is your opinion on the cached version? Do we need it? Should we 
>> keep both?
>>
>> Wolfgang.
> 
> Hi wg,
> 
> Yes, blob parsing will be done from the start of the blob until an 
> answer is found every time a question is asked of it.  Not a paradigm of 
> efficiency. :-/
> 
> WRT the cached version, I have doubts about how much time it will save 
> since I expect the "find compatible" will only be used during 
> initialization.  Is it worth optimizing?  Really slow memory - yes. Fast 
> memory - I doubt it.
> a) I don't picture blobs being stored in really slow memory (no i2c 
> memories).
> b) If the memory really is slow, it seems like it would be as good or 
> better to copy the blob to RAM and use it out of RAM (but there may be 
> chicken & egg problems with that - I don't know how deeply you are 
> looking to embed this).
> 
> I don't know what board/processor/memory you are ultimately targeting 
> with this, so my criticisms may not be valid.  I know denx.de 
> support(s|ed) some very slow to boot boards that lots of tricks were 
> done WRT optimization of env variables because they were stored in i2c 
> memory.

I'm doing that for a MPC823 at 50 MHz, a very low-end system, and almost 
to slow for 2.6. I will do some real measurements when time permits to 
get a better feeling.

> --------
> 
> Other puzzlements:
> 
> You are storing "level" in the cache table, I don't see why.  That is 
> only used while scanning the blob to keep track of node nesting and 
> unnesting and exit when we've unnested back out to the original level. 
> Nesting isn't really applicable when it is in the cache.
> 
> In fdt_find_compatible_node() you have local static variables:
> +    static int level, typefound;
> +    static int nodeoffset, nextoffset;
> and I don't understand why.  I expected them to be auto variables, 
> initialized on entry and discarded on exit.

This is to continue a scan started with startoffset=0.

  *   startoffset  - the node to start searching from or 0, the node
  *                  you pass will not be searched, only the next one
  *                  will; typically, you pass 0 to start the search
  *                  and then what the previous call returned.

And it could be used in the following way:

  offset = 0;
  do {
      offset = fdt_find_compatible_node(fdt, offset, "type", "comp");
  } while (offset >= 0);

This is to be compatible with the Linux version (prom.c) and to scan for 
more than one compatible device efficiently (without re-scanning). For 
example, for the MPC5200 there are 8 compatible GPTs defined. To do so, 
we must preserve nextoffset and level. I also preserve typefound and 
nodeoffset for efficiency reasons and to check sub-sequent calls of 
fdt_find_compatible_node().

> Looking closer at your cached version, I see you really are using 
> nodeoffset as a persistent variable there and I cringe a bit.  That 
> implementation presumes you ask questions sequentially or always reset 
> to 0.  This would prevent you from, for instance, searching for the 
> "/soc8360 at ..." node using fdt_path_offset() and then looking for devices 
> inside that node.  Given what is being represented and how, perhaps I'm 
> creating an unreasonable strawman.

Yes that's the current behaviour for both versions. I thought a scan 
will always be started with 0. Nevertheless, as I see it, the Linux 
version does not use the hierarchy but just scans all device (nodes) 
sequentially. But this could be changed easily and I was already 
thinking to do so.

> Instead of using a static variable, you could scan the cache table for 
> an element whose stored offset is greater than or equal to startoffset, 
> and search the table forward from there (the offsets in the table will 
> be monotonically increasing because of how you build the table). 
> Slightly less efficient, but it will still be pretty good and much 
> better than scanning the whole blob... or maybe I'm complaining based on 
> an unreasonable strawman.

Slightly less efficient, yes, that was my motivation. In U-Boot we 
access the libfdt always sequentially and therefore the trick with 
static variables for efficiency seems OK to me.

> After writing all the above, it bothers me a bit that the hierarchical 
> tree structure of the device tree is getting stuck into a one 
> dimensional cache (the hierarchy is lost), but I cannot think at this 
> point why that would bite us down the road.

See above. I have to check my assumption, though. Let's  clarify first 
the intended behavior of fdt_find_compatible_node().

Wolfgang.