[U-Boot-Users] LIBFDT: first version of fdt_find_compatible

public inbox for u-boot@lists.denx.de
 help / color / mirror / Atom feed

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
@ 2007-04-25  8:58 Wolfgang Grandegger
  2007-04-25 15:12 ` Jerry Van Baren
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Grandegger @ 2007-04-25  8:58 UTC (permalink / raw)
  To: u-boot

Hello,

attached you can find a patch implementing fdt_find_compatible_node():

/*
  * Find a node based on its device type and one of the tokens in
  * its its "compatible" property. On success, the offset of that
  * node is returned or an error code:
  *
  *   startoffset  - the node to start searching from or 0, the node
  *                  you pass will not be searched, only the next one
  *                  will; typically, you pass 0 to start the search
  *                  and then what the previous call returned.
  *   type         - the device type string to match against
  *   compat       - the string to match to one of the tokens
  *                  in the "compatible" list.
  */

It should be used as shown below:

     offset = 0;
     do {
         offset = fdt_find_compatible_node(fdt, offset, "type", "comp");
     } while (offset >= 0);

This first hack also implements a cached version as alternative, because 
tag re-scanning might take quite long. In principle, the cache could 
also be used for other functions, like fdt_path_offset(), and could be 
invalidated in case the FDT gets updated.

What do you think?

Thanks.

Wolfgang.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: u-boot-fdt-find-compatible.patch
Type: text/x-patch
Size: 6384 bytes
Desc: not available
Url : http://lists.denx.de/pipermail/u-boot/attachments/20070425/c8d036c1/attachment.bin 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-25  8:58 [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node Wolfgang Grandegger
@ 2007-04-25 15:12 ` Jerry Van Baren
  2007-04-25 19:37   ` Wolfgang Grandegger
  0 siblings, 1 reply; 8+ messages in thread
From: Jerry Van Baren @ 2007-04-25 15:12 UTC (permalink / raw)
  To: u-boot

Wolfgang Grandegger wrote:
> Hello,
> 
> attached you can find a patch implementing fdt_find_compatible_node():

Yeeee-ha! :-)

> /*
>  * Find a node based on its device type and one of the tokens in
>  * its its "compatible" property. On success, the offset of that
>  * node is returned or an error code:
>  *
>  *   startoffset  - the node to start searching from or 0, the node
>  *                  you pass will not be searched, only the next one
>  *                  will; typically, you pass 0 to start the search
>  *                  and then what the previous call returned.
>  *   type         - the device type string to match against
>  *   compat       - the string to match to one of the tokens
>  *                  in the "compatible" list.
>  */
> 
> It should be used as shown below:
> 
>     offset = 0;
>     do {
>         offset = fdt_find_compatible_node(fdt, offset, "type", "comp");
>     } while (offset >= 0);
> 
> This first hack also implements a cached version as alternative, because 
> tag re-scanning might take quite long. In principle, the cache could 
> also be used for other functions, like fdt_path_offset(), and could be 
> invalidated in case the FDT gets updated.
> 
> What do you think?
> 
> Thanks.
> 
> Wolfgang.

Looks good.  In the real patch, I would like to see the cache addition 
split from the fdt_node_is_compatible() fdt_find_compatible_node() and 
addition.

> ------------------------------------------------------------------------
> 
> diff --git a/include/libfdt.h b/include/libfdt.h
> index f8bac73..3fd7c9f 100644
> --- a/include/libfdt.h
> +++ b/include/libfdt.h
> @@ -89,6 +89,12 @@ uint32_t fdt_next_tag(const void *fdt, int offset,
>  int fdt_num_reservemap(void *fdt, int *used, int *total);
>  int fdt_get_reservemap(void *fdt, int n, struct fdt_reserve_entry *re);
>  
> +int fdt_node_is_compatible(const void *fdt, int nodeoffset,
> +			   const char *compat);
> +
> +int fdt_find_compatible_node(const void *fdt, int startoffset,
> +			     const char *type, const char *compat);
> +
>  /* Write-in-place functions */
>  int fdt_setprop_inplace(void *fdt, int nodeoffset, const char *name,
>  			const void *val, int len);
> diff --git a/libfdt/fdt_ro.c b/libfdt/fdt_ro.c
> index 4e2c325..65ede88 100644
> --- a/libfdt/fdt_ro.c
> +++ b/libfdt/fdt_ro.c
> @@ -55,6 +55,236 @@ char *fdt_string(const void *fdt, int stroffset)
>  	return (char *)fdt + fdt_off_dt_strings(fdt) + stroffset;
>  }
>  
> +#define CONFIG_OF_LIBFDT_TABLE 64
> +#if CONFIG_OF_LIBFDT_TABLE > 0

[snip]

> +#endif  /* CONFIG_OF_LIBFDT_TABLE > 0 */
> +
> +/*
> + * Check if the specified node is compatible by comparing the
> + * tokens in its "compatible" property with the specified string:
> + *
> + *   nodeoffset - starting place of the node
> + *   compat     - the string to match to one of the tokens
> + *                in the "compatible" list.
> + */
> +int fdt_node_is_compatible(const void *fdt, int nodeoffset,
> +			   const char *compat)
> +{
> +	const char* cp;
> +	int cplen, l;
> +
> +	cp = fdt_getprop(fdt, nodeoffset, "compatible", &cplen);
> +	if (cp == NULL)
> +		return 0;
> +	while (cplen > 0) {
> +		if (strncmp(cp, compat, strlen(compat)) == 0)
> +			return 1;
> +		l = strlen(cp) + 1;
> +		cp += l;
> +		cplen -= l;
> +	}
> +
> +	return 0;
> +}

I see this came directly from arch/powerpc/kernel/prom.c, but using "l" 
for a variable is evil.  For a minute, I was wondering how the compiler 
was compiling "1" (one) as a lvalue.  I would prefer it to be "len" or 
something more descriptive.

> +/*
> + * Find a node based on its device type and one of the tokens in
> + * its its "compatible" property. On success, the offset of that
> + * node is return or an error code:
> + *
> + *   startoffset  - the node to start searching from or 0, the node
> + *                  you pass will not be searched, only the next one
> + *                  will; typically, you pass 0 to start the search
> + *                  and then what the previous call returned.
> + *   type         - the device type string to match against
> + *   compat       - the string to match to one of the tokens
> + *                  in the "compatible" list.
> + */
> +#if CONFIG_OF_LIBFDT_TABLE > 0

[snip]

> +#else  /* CONFIG_OF_LIBFDT_TABLE <= 0 */
> +int fdt_find_compatible_node(const void *fdt, int startoffset,
> +			     const char *type, const char *compat)
> +{
> +	static int level, typefound;
> +	static int nodeoffset, nextoffset;
> +	int offset, namestroff;
> +	struct fdt_property *prop;
> +	uint32_t tag;
> +
> +	CHECK_HEADER(fdt);
> +
> +	if (startoffset == 0) {
> +		level = 0;
> +		tag = fdt_next_tag(fdt, 0, &nextoffset, NULL);
> +		if (tag != FDT_BEGIN_NODE)
> +			return -FDT_ERR_BADOFFSET;
> +	} else if (startoffset != nodeoffset)
> +		return -FDT_ERR_BADOFFSET;
> +
> +	do {
> +		offset = nextoffset;
> +		tag = fdt_next_tag(fdt, offset, &nextoffset, NULL);
> +
> +		switch (tag) {
> +		case FDT_END:
> +			return -FDT_ERR_TRUNCATED;
> +
> +		case FDT_BEGIN_NODE:
> +			level++;
> +			nodeoffset = offset;
> +			typefound = 0;
> +			break;
> +
> +		case FDT_END_NODE:
> +			level--;
> +			break;
> +
> +		case FDT_PROP:
> +			if (typefound)
> +				continue;
> +
> +			prop = fdt_offset_ptr_typed(fdt, offset, prop);
> +			if (! prop)
> +				continue;
> +			namestroff = fdt32_to_cpu(prop->nameoff);
> +			if (streq(fdt_string(fdt, namestroff), "device_type")) {
> +				int len = fdt32_to_cpu(prop->len);
> +				typefound = 0;
> +				prop = fdt_offset_ptr(fdt, offset,
> +						      sizeof(*prop)+len);
> +				if (! prop)
> +					continue;
> +				if (strncasecmp(prop->data, type, len - 1) == 0 &&
> +				    fdt_node_is_compatible(fdt, nodeoffset, compat))
> +					return nodeoffset;
> +			}
> +			break;
> +
> +		case FDT_NOP:
> +			break;
> +
> +		default:
> +			return -FDT_ERR_BADSTRUCTURE;
> +		}
> +	} while (level >= 0);
> +
> +	return -FDT_ERR_NOTFOUND;
> +}
> +#endif  /* CONFIG_OF_LIBFDT_TABLE > 0 */
> +
>  /*
>   * Return the node offset of the node specified by:
>   *   parentoffset - starting place (0 to start at the root)

For the above version of fdt_find_compatible_node(), as well as the one 
that fills the cache table (snipped), I'm thinking it would be better to 
use the function I added:

uint32_t fdt_next_tag(const void *fdt, int offset, int *nextoffset, char 
**namep)

I added this to step through the nodes and properties for dumping the 
tree.  Rather than having Yet Another (slightly modified) Copy of the 
loop & switch, you should be able to use fdt_next_tag() to step through 
the nodes and properties, doing the
   if (streq(fdt_string(fdt, namestroff), "device_type"))
on the **namep parameter on every call to find the device_type 
properties.  Does this make sense?

Thanks,
gvb

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-25 15:12 ` Jerry Van Baren
@ 2007-04-25 19:37   ` Wolfgang Grandegger
  2007-04-25 21:06     ` Jerry Van Baren
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Grandegger @ 2007-04-25 19:37 UTC (permalink / raw)
  To: u-boot

Jerry Van Baren wrote:
> Wolfgang Grandegger wrote:
>> Hello,
>>
>> attached you can find a patch implementing fdt_find_compatible_node():
> 
> Yeeee-ha! :-)
> 
>> /*
>>  * Find a node based on its device type and one of the tokens in
>>  * its its "compatible" property. On success, the offset of that
>>  * node is returned or an error code:
>>  *
>>  *   startoffset  - the node to start searching from or 0, the node
>>  *                  you pass will not be searched, only the next one
>>  *                  will; typically, you pass 0 to start the search
>>  *                  and then what the previous call returned.
>>  *   type         - the device type string to match against
>>  *   compat       - the string to match to one of the tokens
>>  *                  in the "compatible" list.
>>  */
>>
>> It should be used as shown below:
>>
>>     offset = 0;
>>     do {
>>         offset = fdt_find_compatible_node(fdt, offset, "type", "comp");
>>     } while (offset >= 0);
>>
>> This first hack also implements a cached version as alternative, 
>> because tag re-scanning might take quite long. In principle, the cache 
>> could also be used for other functions, like fdt_path_offset(), and 
>> could be invalidated in case the FDT gets updated.
>>
>> What do you think?
>>
>> Thanks.
>>
>> Wolfgang.
> 
> Looks good.  In the real patch, I would like to see the cache addition 
> split from the fdt_node_is_compatible() fdt_find_compatible_node() and 
> addition.

Yes, OK, the patch needs some cleanup and improvement.

[...]
>> +#endif  /* CONFIG_OF_LIBFDT_TABLE > 0 */
>> +
>> +/*
>> + * Check if the specified node is compatible by comparing the
>> + * tokens in its "compatible" property with the specified string:
>> + *
>> + *   nodeoffset - starting place of the node
>> + *   compat     - the string to match to one of the tokens
>> + *                in the "compatible" list.
>> + */
>> +int fdt_node_is_compatible(const void *fdt, int nodeoffset,
>> +               const char *compat)
>> +{
>> +    const char* cp;
>> +    int cplen, l;
>> +
>> +    cp = fdt_getprop(fdt, nodeoffset, "compatible", &cplen);
>> +    if (cp == NULL)
>> +        return 0;
>> +    while (cplen > 0) {
>> +        if (strncmp(cp, compat, strlen(compat)) == 0)
>> +            return 1;
>> +        l = strlen(cp) + 1;
>> +        cp += l;
>> +        cplen -= l;
>> +    }
>> +
>> +    return 0;
>> +}
> 
> I see this came directly from arch/powerpc/kernel/prom.c, but using "l" 
> for a variable is evil.  For a minute, I was wondering how the compiler 
> was compiling "1" (one) as a lvalue.  I would prefer it to be "len" or 
> something more descriptive.

The code looks strange indeed. Will use "len" instead of "1", aheee "l".

[...]

> For the above version of fdt_find_compatible_node(), as well as the one 
> that fills the cache table (snipped), I'm thinking it would be better to 
> use the function I added:
> 
> uint32_t fdt_next_tag(const void *fdt, int offset, int *nextoffset, char 
> **namep)
> 
> I added this to step through the nodes and properties for dumping the 
> tree.  Rather than having Yet Another (slightly modified) Copy of the 
> loop & switch, you should be able to use fdt_next_tag() to step through 
> the nodes and properties, doing the
>   if (streq(fdt_string(fdt, namestroff), "device_type"))
> on the **namep parameter on every call to find the device_type 
> properties.  Does this make sense?

Yes, my concern was the overhead due to looking up the location of each 
node name and property. But for fast scanning, the cached version is 
much better anyhow, I think.

What is your opinion on the cached version? Do we need it? Should we 
keep both?

Wolfgang.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-25 19:37   ` Wolfgang Grandegger
@ 2007-04-25 21:06     ` Jerry Van Baren
  2007-04-26  7:39       ` Wolfgang Grandegger
  0 siblings, 1 reply; 8+ messages in thread
From: Jerry Van Baren @ 2007-04-25 21:06 UTC (permalink / raw)
  To: u-boot

Wolfgang Grandegger wrote:
> Jerry Van Baren wrote:
>> Wolfgang Grandegger wrote:
>>> Hello,
>>>
>>> attached you can find a patch implementing fdt_find_compatible_node():

[snip]

>> For the above version of fdt_find_compatible_node(), as well as the 
>> one that fills the cache table (snipped), I'm thinking it would be 
>> better to use the function I added:
>>
>> uint32_t fdt_next_tag(const void *fdt, int offset, int *nextoffset, 
>> char **namep)
>>
>> I added this to step through the nodes and properties for dumping the 
>> tree.  Rather than having Yet Another (slightly modified) Copy of the 
>> loop & switch, you should be able to use fdt_next_tag() to step 
>> through the nodes and properties, doing the
>>   if (streq(fdt_string(fdt, namestroff), "device_type"))
>> on the **namep parameter on every call to find the device_type 
>> properties.  Does this make sense?
> 
> Yes, my concern was the overhead due to looking up the location of each 
> node name and property. But for fast scanning, the cached version is 
> much better anyhow, I think.
> 
> What is your opinion on the cached version? Do we need it? Should we 
> keep both?
> 
> Wolfgang.

Hi wg,

Yes, blob parsing will be done from the start of the blob until an 
answer is found every time a question is asked of it.  Not a paradigm of 
efficiency. :-/

WRT the cached version, I have doubts about how much time it will save 
since I expect the "find compatible" will only be used during 
initialization.  Is it worth optimizing?  Really slow memory - yes. 
Fast memory - I doubt it.
a) I don't picture blobs being stored in really slow memory (no i2c 
memories).
b) If the memory really is slow, it seems like it would be as good or 
better to copy the blob to RAM and use it out of RAM (but there may be 
chicken & egg problems with that - I don't know how deeply you are 
looking to embed this).

I don't know what board/processor/memory you are ultimately targeting 
with this, so my criticisms may not be valid.  I know denx.de 
support(s|ed) some very slow to boot boards that lots of tricks were 
done WRT optimization of env variables because they were stored in i2c 
memory.

--------

Other puzzlements:

You are storing "level" in the cache table, I don't see why.  That is 
only used while scanning the blob to keep track of node nesting and 
unnesting and exit when we've unnested back out to the original level. 
Nesting isn't really applicable when it is in the cache.

In fdt_find_compatible_node() you have local static variables:
+	static int level, typefound;
+	static int nodeoffset, nextoffset;
and I don't understand why.  I expected them to be auto variables, 
initialized on entry and discarded on exit.

Looking closer at your cached version, I see you really are using 
nodeoffset as a persistent variable there and I cringe a bit.  That 
implementation presumes you ask questions sequentially or always reset 
to 0.  This would prevent you from, for instance, searching for the 
"/soc8360 at ..." node using fdt_path_offset() and then looking for devices 
inside that node.  Given what is being represented and how, perhaps I'm 
creating an unreasonable strawman.

Instead of using a static variable, you could scan the cache table for 
an element whose stored offset is greater than or equal to startoffset, 
and search the table forward from there (the offsets in the table will 
be monotonically increasing because of how you build the table). 
Slightly less efficient, but it will still be pretty good and much 
better than scanning the whole blob... or maybe I'm complaining based on 
an unreasonable strawman.

After writing all the above, it bothers me a bit that the hierarchical 
tree structure of the device tree is getting stuck into a one 
dimensional cache (the hierarchy is lost), but I cannot think at this 
point why that would bite us down the road.

Best regards,
gvb

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-25 21:06     ` Jerry Van Baren
@ 2007-04-26  7:39       ` Wolfgang Grandegger
  2007-04-26  9:42         ` Wolfgang Grandegger
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Grandegger @ 2007-04-26  7:39 UTC (permalink / raw)
  To: u-boot

Jerry Van Baren wrote:
> Wolfgang Grandegger wrote:
>> Jerry Van Baren wrote:
>>> Wolfgang Grandegger wrote:
>>>> Hello,
>>>>
>>>> attached you can find a patch implementing fdt_find_compatible_node():
> 
> [snip]
> 
>>> For the above version of fdt_find_compatible_node(), as well as the 
>>> one that fills the cache table (snipped), I'm thinking it would be 
>>> better to use the function I added:
>>>
>>> uint32_t fdt_next_tag(const void *fdt, int offset, int *nextoffset, 
>>> char **namep)
>>>
>>> I added this to step through the nodes and properties for dumping the 
>>> tree.  Rather than having Yet Another (slightly modified) Copy of the 
>>> loop & switch, you should be able to use fdt_next_tag() to step 
>>> through the nodes and properties, doing the
>>>   if (streq(fdt_string(fdt, namestroff), "device_type"))
>>> on the **namep parameter on every call to find the device_type 
>>> properties.  Does this make sense?
>>
>> Yes, my concern was the overhead due to looking up the location of 
>> each node name and property. But for fast scanning, the cached version 
>> is much better anyhow, I think.
>>
>> What is your opinion on the cached version? Do we need it? Should we 
>> keep both?
>>
>> Wolfgang.
> 
> Hi wg,
> 
> Yes, blob parsing will be done from the start of the blob until an 
> answer is found every time a question is asked of it.  Not a paradigm of 
> efficiency. :-/
> 
> WRT the cached version, I have doubts about how much time it will save 
> since I expect the "find compatible" will only be used during 
> initialization.  Is it worth optimizing?  Really slow memory - yes. Fast 
> memory - I doubt it.
> a) I don't picture blobs being stored in really slow memory (no i2c 
> memories).
> b) If the memory really is slow, it seems like it would be as good or 
> better to copy the blob to RAM and use it out of RAM (but there may be 
> chicken & egg problems with that - I don't know how deeply you are 
> looking to embed this).
> 
> I don't know what board/processor/memory you are ultimately targeting 
> with this, so my criticisms may not be valid.  I know denx.de 
> support(s|ed) some very slow to boot boards that lots of tricks were 
> done WRT optimization of env variables because they were stored in i2c 
> memory.

I'm doing that for a MPC823 at 50 MHz, a very low-end system, and almost 
to slow for 2.6. I will do some real measurements when time permits to 
get a better feeling.

> --------
> 
> Other puzzlements:
> 
> You are storing "level" in the cache table, I don't see why.  That is 
> only used while scanning the blob to keep track of node nesting and 
> unnesting and exit when we've unnested back out to the original level. 
> Nesting isn't really applicable when it is in the cache.
> 
> In fdt_find_compatible_node() you have local static variables:
> +    static int level, typefound;
> +    static int nodeoffset, nextoffset;
> and I don't understand why.  I expected them to be auto variables, 
> initialized on entry and discarded on exit.

This is to continue a scan started with startoffset=0.

  *   startoffset  - the node to start searching from or 0, the node
  *                  you pass will not be searched, only the next one
  *                  will; typically, you pass 0 to start the search
  *                  and then what the previous call returned.

And it could be used in the following way:

  offset = 0;
  do {
      offset = fdt_find_compatible_node(fdt, offset, "type", "comp");
  } while (offset >= 0);

This is to be compatible with the Linux version (prom.c) and to scan for 
more than one compatible device efficiently (without re-scanning). For 
example, for the MPC5200 there are 8 compatible GPTs defined. To do so, 
we must preserve nextoffset and level. I also preserve typefound and 
nodeoffset for efficiency reasons and to check sub-sequent calls of 
fdt_find_compatible_node().

> Looking closer at your cached version, I see you really are using 
> nodeoffset as a persistent variable there and I cringe a bit.  That 
> implementation presumes you ask questions sequentially or always reset 
> to 0.  This would prevent you from, for instance, searching for the 
> "/soc8360 at ..." node using fdt_path_offset() and then looking for devices 
> inside that node.  Given what is being represented and how, perhaps I'm 
> creating an unreasonable strawman.

Yes that's the current behaviour for both versions. I thought a scan 
will always be started with 0. Nevertheless, as I see it, the Linux 
version does not use the hierarchy but just scans all device (nodes) 
sequentially. But this could be changed easily and I was already 
thinking to do so.

> Instead of using a static variable, you could scan the cache table for 
> an element whose stored offset is greater than or equal to startoffset, 
> and search the table forward from there (the offsets in the table will 
> be monotonically increasing because of how you build the table). 
> Slightly less efficient, but it will still be pretty good and much 
> better than scanning the whole blob... or maybe I'm complaining based on 
> an unreasonable strawman.

Slightly less efficient, yes, that was my motivation. In U-Boot we 
access the libfdt always sequentially and therefore the trick with 
static variables for efficiency seems OK to me.

> After writing all the above, it bothers me a bit that the hierarchical 
> tree structure of the device tree is getting stuck into a one 
> dimensional cache (the hierarchy is lost), but I cannot think at this 
> point why that would bite us down the road.

See above. I have to check my assumption, though. Let's  clarify first 
the intended behavior of fdt_find_compatible_node().

Wolfgang.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-26  7:39       ` Wolfgang Grandegger
@ 2007-04-26  9:42         ` Wolfgang Grandegger
  2007-04-26 10:45           ` Jerry Van Baren
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Grandegger @ 2007-04-26  9:42 UTC (permalink / raw)
  To: u-boot

Hi Jerry,

 >Wolfgang Grandegger wrote:
>> Jerry Van Baren wrote:
[snip]
>> Yes, blob parsing will be done from the start of the blob until an 
>> answer is found every time a question is asked of it.  Not a paradigm of 
>> efficiency. :-/
>>
>> WRT the cached version, I have doubts about how much time it will save 
>> since I expect the "find compatible" will only be used during 
>> initialization.  Is it worth optimizing?  Really slow memory - yes. Fast 
>> memory - I doubt it.
>> a) I don't picture blobs being stored in really slow memory (no i2c 
>> memories).
>> b) If the memory really is slow, it seems like it would be as good or 
>> better to copy the blob to RAM and use it out of RAM (but there may be 
>> chicken & egg problems with that - I don't know how deeply you are 
>> looking to embed this).
>>
>> I don't know what board/processor/memory you are ultimately targeting 
>> with this, so my criticisms may not be valid.  I know denx.de 
>> support(s|ed) some very slow to boot boards that lots of tricks were 
>> done WRT optimization of env variables because they were stored in i2c 
>> memory.
> 
> I'm doing that for a MPC823 at 50 MHz, a very low-end system, and almost 
> to slow for 2.6. I will do some real measurements when time permits to 
> get a better feeling.

Here are the results of some quick measurements on my MPC855 at 80/40 
MHz with the attached code example and my DTS test file:

               from FLASH   from Memory
Non-cached:    11116 us       1703 us
Cached    :     2800 us       6226 us

Well, I think we can drop the cached version even if its 4 times faster, 
as it make life more difficult, especially in case the FDT gets updated.

Wolfgang.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fdt-find-compatible-nodes.c
Type: text/x-csrc
Size: 907 bytes
Desc: not available
Url : http://lists.denx.de/pipermail/u-boot/attachments/20070426/6a8c55ec/attachment.c 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lite5200.dts
Url: http://lists.denx.de/pipermail/u-boot/attachments/20070426/6a8c55ec/attachment.txt 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-26  9:42         ` Wolfgang Grandegger
@ 2007-04-26 10:45           ` Jerry Van Baren
  2007-04-26 11:12             ` Wolfgang Grandegger
  0 siblings, 1 reply; 8+ messages in thread
From: Jerry Van Baren @ 2007-04-26 10:45 UTC (permalink / raw)
  To: u-boot

Wolfgang Grandegger wrote:
> Hi Jerry,
> 
>  >Wolfgang Grandegger wrote:
>>> Jerry Van Baren wrote:
> [snip]
>>> Yes, blob parsing will be done from the start of the blob until an 
>>> answer is found every time a question is asked of it.  Not a paradigm 
>>> of efficiency. :-/
>>>
>>> WRT the cached version, I have doubts about how much time it will 
>>> save since I expect the "find compatible" will only be used during 
>>> initialization.  Is it worth optimizing?  Really slow memory - yes. 
>>> Fast memory - I doubt it.
>>> a) I don't picture blobs being stored in really slow memory (no i2c 
>>> memories).
>>> b) If the memory really is slow, it seems like it would be as good or 
>>> better to copy the blob to RAM and use it out of RAM (but there may 
>>> be chicken & egg problems with that - I don't know how deeply you are 
>>> looking to embed this).
>>>
>>> I don't know what board/processor/memory you are ultimately targeting 
>>> with this, so my criticisms may not be valid.  I know denx.de 
>>> support(s|ed) some very slow to boot boards that lots of tricks were 
>>> done WRT optimization of env variables because they were stored in 
>>> i2c memory.
>>
>> I'm doing that for a MPC823 at 50 MHz, a very low-end system, and 
>> almost to slow for 2.6. I will do some real measurements when time 
>> permits to get a better feeling.
> 
> Here are the results of some quick measurements on my MPC855 at 80/40 
> MHz with the attached code example and my DTS test file:
> 
>               from FLASH   from Memory
> Non-cached:    11116 us       1703 us
> Cached    :     2800 us       6226 us
> 
> Well, I think we can drop the cached version even if its 4 times faster, 
> as it make life more difficult, especially in case the FDT gets updated.
> 
> Wolfgang.

Risk & reward.  The reward is pretty substantial if you read directly 
from slow flash, but iffy for faster RAM.

In the "from Memory" column, do you have the numbers switched?  The 
non-cached is shown as being 4x faster than the cached.

To be a fair comparison, the "from Memory" column also needs the time it 
would take to copy the blob from flash to RAM added, yes?  That penalty 
can be bypassed by loading the blob directly into RAM via tftp or the 
copy to RAM time may already be part of the boot process, but it should 
be identified.

gvb

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node
  2007-04-26 10:45           ` Jerry Van Baren
@ 2007-04-26 11:12             ` Wolfgang Grandegger
  0 siblings, 0 replies; 8+ messages in thread
From: Wolfgang Grandegger @ 2007-04-26 11:12 UTC (permalink / raw)
  To: u-boot

Jerry Van Baren wrote:
> Wolfgang Grandegger wrote:
>> Hi Jerry,
>>
>>  >Wolfgang Grandegger wrote:
>>>> Jerry Van Baren wrote:
>> [snip]
>>>> Yes, blob parsing will be done from the start of the blob until an 
>>>> answer is found every time a question is asked of it.  Not a 
>>>> paradigm of efficiency. :-/
>>>>
>>>> WRT the cached version, I have doubts about how much time it will 
>>>> save since I expect the "find compatible" will only be used during 
>>>> initialization.  Is it worth optimizing?  Really slow memory - yes. 
>>>> Fast memory - I doubt it.
>>>> a) I don't picture blobs being stored in really slow memory (no i2c 
>>>> memories).
>>>> b) If the memory really is slow, it seems like it would be as good 
>>>> or better to copy the blob to RAM and use it out of RAM (but there 
>>>> may be chicken & egg problems with that - I don't know how deeply 
>>>> you are looking to embed this).
>>>>
>>>> I don't know what board/processor/memory you are ultimately 
>>>> targeting with this, so my criticisms may not be valid.  I know 
>>>> denx.de support(s|ed) some very slow to boot boards that lots of 
>>>> tricks were done WRT optimization of env variables because they were 
>>>> stored in i2c memory.
>>>
>>> I'm doing that for a MPC823 at 50 MHz, a very low-end system, and 
>>> almost to slow for 2.6. I will do some real measurements when time 
>>> permits to get a better feeling.
>>
>> Here are the results of some quick measurements on my MPC855 at 80/40 
>> MHz with the attached code example and my DTS test file:
>>
>>               from FLASH   from Memory
>> Non-cached:    11116 us       1703 us
>> Cached    :     2800 us       6226 us
>>
>> Well, I think we can drop the cached version even if its 4 times 
>> faster, as it make life more difficult, especially in case the FDT 
>> gets updated.
>>
>> Wolfgang.
> 
> Risk & reward.  The reward is pretty substantial if you read directly 
> from slow flash, but iffy for faster RAM.
> 
> In the "from Memory" column, do you have the numbers switched?  The 
> non-cached is shown as being 4x faster than the cached.

Yes, a cut & paste error, sorry.

> 
> To be a fair comparison, the "from Memory" column also needs the time it 
> would take to copy the blob from flash to RAM added, yes?  That penalty 
> can be bypassed by loading the blob directly into RAM via tftp or the 
> copy to RAM time may already be part of the boot process, but it should 
> be identified.

Here is the revised list:

                 from FLASH   from Memory
   Non-cached:    11116 us       6226 us
   Cached    :     2800 us       1703 us (2096 us)

The number in brackets is _with_ memcpy. The FDT should normally be 
access from memory, I fully agree.

Wolfgang.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-04-26 11:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-25  8:58 [U-Boot-Users] LIBFDT: first version of fdt_find_compatible_node Wolfgang Grandegger
2007-04-25 15:12 ` Jerry Van Baren
2007-04-25 19:37   ` Wolfgang Grandegger
2007-04-25 21:06     ` Jerry Van Baren
2007-04-26  7:39       ` Wolfgang Grandegger
2007-04-26  9:42         ` Wolfgang Grandegger
2007-04-26 10:45           ` Jerry Van Baren
2007-04-26 11:12             ` Wolfgang Grandegger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox