From mboxrd@z Thu Jan  1 00:00:00 1970
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Sat, 17 Jan 2015 05:50:02 +0200
Subject: [U-Boot] [PATCH 0/2] Really complete SPL & u-boot log on all
 consoles
In-Reply-To: <20150114152747.GU10826@bill-the-cat>
References: <1421152210-14441-1-git-send-email-siarhei.siamashka@gmail.com>
	<54B6262B.2070505@redhat.com> <20150114145729.0649352e@i7>
	<20150114152747.GU10826@bill-the-cat>
Message-ID: <20150117055002.4326c76d@i7>
List-Id: <u-boot.lists.denx.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On Wed, 14 Jan 2015 10:27:47 -0500
Tom Rini <trini@ti.com> wrote:

> On Wed, Jan 14, 2015 at 02:57:29PM +0200, Siarhei Siamashka wrote:
> 
> [snip]
> > level of maturity and support. I'm sure that some u-boot developers
> > are also using something (otherwise, what is the point enabling
> > '-fstack-usage' GCC option in the first place?).
> 
> Not perfect, but in doc/README.SPL:
> 1) Build normally
> 2) Perform the following shell command to generate a list of C files used in
> used in
> $ find spl -name '*.su' | sed -e 's:^spl/::' -e 's:[.]su$:.c:' > used-spl.list
> 3) Execute cflow:
> $ cflow --main=board_init_r `cat used-spl.list` 2>&1 | $PAGER
> 
> And then, yeah, manual poking / just knowing that func() is or is not a
> big stack user.

Thanks for pointing me to the relevant instructions. However, it does
not look like cflow can produce the final stack usage number
automatically.

Meanwhile, I have improved my script to take care of the indirect
calls too. For example, here is a comparison of the results from
the initial prototype
   http://people.freedesktop.org/~siamashka/files/20150114-spl-stackgraph/spl-stackgraph-v2015.01-cubieboard2-fel.png
and the newer indirect-call-aware version:
   http://people.freedesktop.org/~siamashka/files/20150116-spl-stackgraph/spl-stackgraph-v2015.01-cubieboard2-fel.png

On the first picture, the serial and i2c functions sit on their own
disconnected islands. On the second picture, indirect call sources and
indirect call targets are identified. We are not really interested
in the exact execution flow, but want a reasonably accurate estimate
of the upper bound for stack usage. Underestimating stack usage is
bad, but overestimating it a bit is perfectly fine. So we can just
evaluate all the possible combinations of indirect call sources and
targets (regardless of whether they make any sense) and pick the one
that results in the maximal stack usage.
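
To illustrate the idea, here is a minimal Python sketch (not the actual
script). The call graph, the function names and the frame sizes are all
made up, and the graph is assumed to have no recursion:

```python
# Toy call graph; all function names and frame sizes are made up.
FRAME = {"main": 24, "console_write": 16, "bus_read": 40,
         "uart_putc": 8, "twi_xfer": 64}
CALLS = {"main": ["console_write", "bus_read"],
         "console_write": ["<indirect>"],
         "bus_read": ["<indirect>"]}
INDIRECT_TARGETS = ["uart_putc", "twi_xfer"]


def max_stack(func):
    """Upper bound of stack usage for any call chain starting at func.

    Assumes the call graph has no cycles (no recursion).
    """
    worst = 0
    for callee in CALLS.get(func, []):
        # An indirect call site may reach any known indirect call target,
        # whether that pairing makes sense or not.
        targets = INDIRECT_TARGETS if callee == "<indirect>" else [callee]
        for target in targets:
            worst = max(worst, max_stack(target))
    return FRAME[func] + worst


print(max_stack("main"))
```

For this toy data the result is 128 bytes (main -> bus_read -> twi_xfer).
The nonsensical console_write -> twi_xfer pairing is evaluated too, but
it only inflates the estimate for that branch and never lowers the bound.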

The indirect call sources (octagonal shaped nodes) are identified by
basically looking for "b/bl/blx reg" instructions in the objdump log.
The indirect call targets (octagonal shaped boxes with double boundary)
are identified by parsing the relocation tables (if a function is
ever called via a pointer, then this pointer must be stored somewhere
and have an entry in the relocation table). This is rather simple,
but seems to be reasonably reliable and efficient.
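
The first half (finding the indirect call sources) could look roughly
like the Python sketch below. It is an illustration only, not the actual
script: the regular expressions assume the usual "objdump -d" line
layout for ARM, the sample disassembly is made up, and conditional
variants of the instructions are ignored. The relocation table part is
not shown.

```python
import re

# "00000010 <name>:" starts a new function in the objdump output.
FUNC = re.compile(r"^[0-9a-f]+ <([^>]+)>:\s*$")
# "  14:  e12fff33  blx  r3": address, encoding, mnemonic, register operand.
INSN = re.compile(
    r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]+\s+)+(b|bl|blx)\s+(r\d+|ip|lr)\s*$")

# Made-up sample input, only for demonstration.
SAMPLE = """\
00000010 <console_write>:
  10:\te92d4010 \tpush\t{r4, lr}
  14:\te12fff33 \tblx\tr3
  18:\te8bd8010 \tpop\t{r4, pc}

0000001c <helper>:
  1c:\teb000012 \tbl\t6c <other>
  20:\te12fff1e \tbx\tlr
"""


def indirect_call_sources(disassembly):
    """Return the names of functions containing a branch via a register."""
    sources = set()
    current = None
    for line in disassembly.splitlines():
        header = FUNC.match(line)
        if header:
            current = header.group(1)
        elif current and INSN.match(line):
            sources.add(current)
    return sources


print(indirect_call_sources(SAMPLE))
```

Here only console_write is reported: the "bl" in helper has a direct
target, and "bx lr" is just a function return.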

As for the dynamically sized arrays in functions, it makes no sense
to guess. Somebody just has to provide an upper bound for these
allocations to the script. Maybe even via a special comment tag in
the source code, which can be parsed automatically?
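
A rough Python sketch of how such user-provided bounds could be combined
with the *.su data (illustration only). It relies on GCC marking such
functions with the "dynamic" qualifier in the *.su files. The file and
function names, the dynamic_bounds table and the choice to add the bound
on top of the fixed part of the frame are my assumptions:

```python
def load_su(lines, dynamic_bounds):
    """Parse *.su lines of the form "file:line:col:func<TAB>size<TAB>qualifier".

    dynamic_bounds is a user-supplied {function: bytes} table (hypothetical)
    with upper bounds for allocations GCC cannot size at compile time.
    """
    usage = {}
    for line in lines:
        location, size, qualifier = line.rstrip("\n").split("\t")
        func = location.rsplit(":", 1)[1]
        size = int(size)
        # "dynamic" without "bounded" means the reported size is not reliable.
        if qualifier.startswith("dynamic") and "bounded" not in qualifier:
            if func not in dynamic_bounds:
                raise ValueError(func + ": dynamic stack usage, bound needed")
            # Assumption: the bound is added to the fixed part of the frame.
            size += dynamic_bounds[func]
        usage[func] = size
    return usage


# Made-up sample data, only for demonstration.
su_lines = ["init.c:10:6:init_board\t24\tstatic",
            "fmt.c:5:5:format_buf\t16\tdynamic"]
print(load_su(su_lines, {"format_buf": 128}))
```

A function with unbounded dynamic usage and no entry in the table
raises an error instead of being silently underestimated.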

Assembly functions, which do not have *.su information from GCC, may
cause some difficulties in theory. I have added code to estimate the
stack usage in such functions, based on parsing "push" and
"sub sp, sp, #imm" instructions. But maybe these should be highlighted
with a special color on the callgraph, so that the developer running
the script could pay special attention to them?
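
The estimation itself can be sketched in Python like this (a simplified
illustration, not the code from the script). It assumes 32-bit ARM with
4 bytes per pushed register, register ranges written as "rN-rM", and
simply sums up everything it sees, which may overestimate functions
with several exit paths. Other forms such as "stmfd sp!, {...}" are
not handled:

```python
import re

PUSH = re.compile(r"\bpush\s+\{([^}]*)\}")
SUB_SP = re.compile(r"\bsub\s+sp,\s*sp,\s*#(0x[0-9a-fA-F]+|\d+)")


def count_regs(reglist):
    """Count registers in a list like "r4, r5, lr" or "r4-r7, lr"."""
    count = 0
    for item in reglist.split(","):
        item = item.strip()
        if "-" in item:
            # Range such as r4-r7 (assumes plain rN names on both ends).
            lo, hi = item.split("-")
            count += int(hi.strip()[1:]) - int(lo.strip()[1:]) + 1
        elif item:
            count += 1
    return count


def estimate_stack(asm_lines):
    """Sum the stack space taken by push and "sub sp, sp, #imm"."""
    total = 0
    for line in asm_lines:
        m = PUSH.search(line)
        if m:
            total += 4 * count_regs(m.group(1))
        m = SUB_SP.search(line)
        if m:
            total += int(m.group(1), 0)
    return total


print(estimate_stack(["push {r4, r5, lr}", "sub sp, sp, #24"]))
```

The example prints 36: three registers of 4 bytes each plus 24 bytes
of local storage.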

The script is also able to detect suspected recursion cases. Currently
the u-boot SPL code has a self-destructive
"get_current -> puts -> serial_puts -> get_current" recursion. It is
triggered by a serial console failure, which results in a debug
message being printed to the serial console, which in turn causes it
to fail again and attempt to print even more error messages...

In the case of indirect calls, evaluating weird permutations of
caller->callee can result in a lot of complaints from the recursion
detection logic in the script. This can be solved by re-running
the script repeatedly and blacklisting impossible caller->callee
pairs manually until the script is happy.
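
A minimal Python sketch of recursion detection with such a blacklist
(illustration only; the call graph and the function names are made up,
and the real script may work differently):

```python
def find_recursion(calls, entry, blacklist=frozenset()):
    """Return call chains that loop back onto themselves.

    blacklist holds (caller, callee) pairs that are known to be impossible
    and must not be followed.
    """
    found = []

    def visit(func, path):
        if func in path:
            # Report only the looping part of the chain.
            found.append(path[path.index(func):] + [func])
            return
        for callee in calls.get(func, []):
            if (func, callee) not in blacklist:
                visit(callee, path + [func])

    visit(entry, [])
    return found


# Made-up graph: console_write has an indirect call with two candidate
# targets, and twi_xfer prints a debug message via console_write.
CALLS = {"main": ["console_write"],
         "console_write": ["uart_putc", "twi_xfer"],
         "twi_xfer": ["console_write"]}

print(find_recursion(CALLS, "main"))
print(find_recursion(CALLS, "main", {("console_write", "twi_xfer")}))
```

The first call reports console_write -> twi_xfer -> console_write.
After blacklisting the impossible console_write -> twi_xfer pair,
the second call reports nothing.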


Anyway, right now I have a feature-complete prototype of the
(hopefully) accurate SPL stack usage detection tool. It may take
a few more days before it is cleaned up and ready to be contributed
to u-boot. I just wonder whether Python is required for implementing
u-boot tools, or whether a Ruby script can be accepted too?

-- 
Best regards,
Siarhei Siamashka