From mboxrd@z Thu Jan 1 00:00:00 1970
From: Siarhei Siamashka
Date: Sat, 17 Jan 2015 05:50:02 +0200
Subject: [U-Boot] [PATCH 0/2] Really complete SPL & u-boot log on all consoles
In-Reply-To: <20150114152747.GU10826@bill-the-cat>
References: <1421152210-14441-1-git-send-email-siarhei.siamashka@gmail.com>
 <54B6262B.2070505@redhat.com>
 <20150114145729.0649352e@i7>
 <20150114152747.GU10826@bill-the-cat>
Message-ID: <20150117055002.4326c76d@i7>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On Wed, 14 Jan 2015 10:27:47 -0500
Tom Rini wrote:

> On Wed, Jan 14, 2015 at 02:57:29PM +0200, Siarhei Siamashka wrote:
>
> [snip]
> > level of maturity and support. I'm sure that some u-boot developers
> > are also using something (otherwise, what is the point enabling
> > '-fstack-usage' GCC option in the first place?).
>
> Not perfect, but in doc/README.SPL:
> 1) Build normally
> 2) Perform the following shell command to generate a list of C files used in
> $ find spl -name '*.su' | sed -e 's:^spl/::' -e 's:[.]su$:.c:' > used-spl.list
> 3) Execute cflow:
> $ cflow --main=board_init_r `cat used-spl.list` 2>&1 | $PAGER
>
> And then, yeah, manual poking / just knowing that func() is or is not a
> big stack user.

Thanks for pointing to the relevant instructions. It does not look like
cflow can produce the final stack usage number automatically though.

Meanwhile, I have improved my script to take care of the indirect calls
too. For example, here is a comparison of the results from the initial
prototype

    http://people.freedesktop.org/~siamashka/files/20150114-spl-stackgraph/spl-stackgraph-v2015.01-cubieboard2-fel.png

and the newer indirect calls aware version:

    http://people.freedesktop.org/~siamashka/files/20150116-spl-stackgraph/spl-stackgraph-v2015.01-cubieboard2-fel.png

On the first picture, the serial and i2c functions are in their own
disconnected isles.
On the second picture, indirect call sources and indirect call targets
are identified.

We are not really interested in the exact execution flow, but want to
have a reasonably accurate estimate of the upper bound for stack usage.
Underestimating stack usage is bad, but overestimating it a bit is
perfectly fine. So we can just evaluate all the possible permutations
of the indirect call execution paths (regardless of whether they make
any sense) and pick the one that results in the maximal stack usage.

The indirect call sources (octagonal shaped nodes) are identified by
basically looking for "b/bl/blx reg" instructions in the objdump log.
The indirect call targets (octagonal shaped boxes with double boundary)
are identified by parsing the relocation tables (if a function is ever
called via a pointer, then this pointer must be stored somewhere and
have an entry in the relocation table). This is rather simple, but
seems to be reasonably reliable and efficient.

As for the dynamically sized arrays in functions, guessing makes no
sense. Somebody just has to provide an upper bound for these
allocations to the script. Maybe even via a special comment tag in the
source code, which can be parsed automatically?

Assembly functions, which do not have *.su information from GCC, may
cause some difficulties in theory. I have added code to estimate the
stack usage in such functions, based on parsing "push" and
"sub sp, sp, #imm" instructions. But maybe these should be highlighted
with a special color on the callgraph, so that the developer running
the script could pay special attention to them?

The script is also able to detect suspected recursion cases. Currently
the u-boot SPL code has a self-destructive
"get_current -> puts -> serial_puts -> get_current" recursion. It is
triggered by a serial console failure: the failure results in a debug
message being printed to the serial console, which in turn causes it
to fail again and attempt to print even more error messages...

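
To make the steps above concrete, here is a minimal sketch in Python.
It is not the actual script, only an illustration under stated
assumptions: 32-bit ARM (4 bytes per pushed register), decimal
immediates in "sub sp, sp, #imm", and frame sizes and call edges that
have already been collected elsewhere. All names in it
(estimate_asm_frame, add_indirect_edges, worst_case, CallCycle, the
"excluded" parameter) are hypothetical.

```python
import re

# Patterns for the two instruction forms named above (assumed objdump syntax).
PUSH_RE = re.compile(r"\bpush\s*\{([^}]*)\}")
SUB_SP_RE = re.compile(r"\bsub\s+sp,\s*sp,\s*#(\d+)")


class CallCycle(Exception):
    """Raised when a call chain revisits a function (suspected recursion)."""


def count_regs(reglist):
    """Count registers in a list such as 'r4, r5, lr' or 'r4-r7, lr'."""
    count = 0
    for item in (part.strip() for part in reglist.split(",")):
        if "-" in item:
            first, last = (name.strip() for name in item.split("-"))
            count += int(last[1:]) - int(first[1:]) + 1
        elif item:
            count += 1
    return count


def estimate_asm_frame(disasm_lines):
    """Estimate stack bytes used by an assembly function without *.su data."""
    total = 0
    for line in disasm_lines:
        push = PUSH_RE.search(line)
        if push:
            total += 4 * count_regs(push.group(1))  # assumption: 4-byte registers
        sub = SUB_SP_RE.search(line)
        if sub:
            total += int(sub.group(1))
    return total


def add_indirect_edges(calls, sources, targets, excluded=frozenset()):
    """Connect every indirect call source to every indirect call target.

    'excluded' is a hypothetical set of (caller, callee) pairs known to be
    impossible; everything else is kept, sensible or not.
    """
    graph = {func: set(callees) for func, callees in calls.items()}
    for src in sources:
        for dst in targets:
            if (src, dst) not in excluded:
                graph.setdefault(src, set()).add(dst)
    return graph


def worst_case(func, frames, graph, chain=()):
    """Return (bytes, call chain) for the deepest path starting at func."""
    if func in chain:
        raise CallCycle(" -> ".join(chain + (func,)))
    best_bytes, best_chain = 0, ()
    for callee in sorted(graph.get(func, ())):
        sub_bytes, sub_chain = worst_case(callee, frames, graph, chain + (func,))
        if sub_bytes > best_bytes:
            best_bytes, best_chain = sub_bytes, sub_chain
    return frames.get(func, 0) + best_bytes, (func,) + best_chain
```

Calling worst_case("board_init_r", frames, graph) with a "frames"
dictionary of per-function byte counts returns the largest total and
the chain that produces it, or raises CallCycle with the offending
chain. The plain depth-first walk revisits shared subtrees, which is
wasteful but keeps the chain available for the recursion report.
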
In the case of indirect calls, evaluating weird permutations of
caller->callee can result in a lot of complaints from the recursion
detection logic in the script. This can be solved by re-running the
script repeatedly and manually blacklisting impossible caller->callee
pairs until the script is happy.

Anyway, right now I have a feature-complete prototype of the
(hopefully) accurate SPL stack usage detection tool. It may take a few
more days before it is cleaned up and ready to be contributed to
u-boot. I just wonder whether Python is required for implementing
u-boot tools, or whether a Ruby script can be accepted too?

--
Best regards,
Siarhei Siamashka