avstack.pl

Daniel Beer
dlbeer@gmail.com
31 May 2013

When developing for memory constrained systems, it's useful to know at compile-time that the available memory is sufficient for your application firmware. Dynamic allocation is usually avoided, and the size of statically allocated memory in the .data and .bss sections can be easily inspected with GNU size or a similary tool. This leaves one problem: stack use. The stack grows and shrinks dynamically at runtime, but under some circumstances, the peak stack use can be statically analyzed.

The script linked above is a static stack checker, intended for use with a recent-ish version of AVR-GCC. In order to use it, you must ensure that your object files are compiled with -fstack-usage. This causes GCC to generate a .su file for every .o file. These files contain information on the size of the stack frame for each function compiled in the .o file. For example:

sha1.c:43:13:mix_w      0       static
sha1.c:54:13:rot_left   0       static
sha1.c:69:13:rot_right  0       static
sha1.c:178:6:sha1_init  0       static
sha1.c:183:6:sha1_next  65      static
sha1.c:192:6:sha1_final 6       static

Once you have these files, invoke avstack.pl, passing as arguments the names of all .o files that will be linked into your binary. The .su files are assumed to be located in the same directories as their corresponding .o files.

The script reads all .su files, and disassembles all .o files, including relocation data. The disassemblies are parsed and used to construct a call graph. Multiple functions in different translation units with the same name don't cause problems, provided there are no global namespace collisions. Information will appear on any unresolvable or ambiguous references.

Next, the call graph is traced to find the peak stack usage of all functions. This is calculated for each function as the maximum stack usage of any of its callees, plus its own stack frame, plus some call-cost constant (not included in GCC's analysis).

Here's an example of the output produced:

$ ./avstack.pl */*.o
  Func                               Cost    Frame  Height
------------------------------------------------------------------------
> main                                235       90        4
  prng_next                           145       72        3
  sha1_final                           83       10        3
  sha1_next                            73       69        2
  __vector_16                          28       20        3
> INTERRUPT                            28        0        4
  clock_signal_tick                     8        4        2
  clock_iterate                         8        4        2
  clock_init                            4        4        1
  led_init                              4        4        1
  led_set                               4        4        1
> led_get                               4        4        1
  rot_right                             4        4        1
  mix_w                                 4        4        1
  prng_init                             4        4        1
  clock_signal_pps                      4        4        1
  clock_get_raw                         4        4        1
  clock_get                             4        4        1
  prng_stir                             4        4        1
  rot_left                              4        4        1
  sha1_init                             4        4        1
  led_slot_clock_tick                   4        4        1

The following functions were not resolved:
  memcpy_P

The columns are:

Indicators to the left of the function name indicate features of the call graph. A > indicates that the function has no callee. This could be because it's an entry point (like main or an interrupt vector), because the function is called dynamically through a function pointer, or simply that it's unused. An R indicates that the function is recursive, and the cost estimate is for a single level of recursion. Multiple functions with the same name are distinguished in the listing by appending a suffix of the form @filename.o.

You can customize this script by altering two variables near the beginning:

my $objdump = "avr-objdump";
my $call_cost = 4;

Note that making sense of the output of the stack analysis still requires you to know something about how your program runs. For example, in many programs, the actual peak stack use would be the cost of main, plus the maximum cost of any interrupt handler that might execute. You will also need to take into account dynamically invoked functions, if you have any.

To make things easier, there is a processing section in which fake nodes and edges can be added to the call graph to account for dynamic behaviour. For example, the script currently contains in this section, to represent interrupt execution:

$call_graph{"INTERRUPT"} = {};

foreach (keys %call_graph) {
    $call_graph{"INTERRUPT"}->{$_} = 1 if /^__vector_/;
}

Copyright © 2013 Daniel Beer <>

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.