libwebsockets
Lightweight C library for HTML5 websockets
|
Area | Definition |
---|---|
Cmake | LWS_WITH_COMPRESSED_BACKTRACES on by default |
API | ./include/libwebsockets/lws-backtrace.h |
README | ./READMEs/README.lws_backtrace.md |
The lws_backtrace
apis provide a way to collect backtrace addresses into a struct, and an efficient domain-specific compressor to reduce the number of bytes needed to express the backtrace stack.
This information is particularly useful in RTOS type systems to understand heap usage. The information would typically be sent off the embedded device, in logs or it into own stream, and decompressed and processed off the embedded device, converted to source information via addr2line or similar.
It only works with gcc and probably clang at the moment (patches welcome).
This provides helpers on top of lws_backtrace
that are suitable for adapting your heap allocator to create compressed metadata such as the call stack at allocation time
The extra metadata contains information on allocation size, and the backtrace of the code path that originally performed the allocation. Live allocations are also listed on one or more lws_dll2_owner_t that can be walked to dump active allocations along with the responsible code path.
Entries at both ends of the call stack may be invariant and therefore just bloat to store. At the top end of the call stack, the backtrace will show the path through lws_backtrace apis and perhaps other apis. At the bottom end, depending on your system, the backtrace may detail call sequences from the loader that started your application.
For those reasons, the cmake variables LWS_COMPRESSED_BACKTRACES_SNIP_PRE
and LWS_COMPRESSED_BACKTRACES_SNIP_POST
(defaulting to 2 and 1 respectively) may be set to remove invariant, uninteresting call stack information from the top and bottom of the call stack.
An optional, off-by-default implementation is provided for the lws_*alloc() apis, using the alloc_metadata apis to instrument all allocations via lws_*alloc(). This is not so useful as instrumenting the system allocator with alloc_metadata apis, since it only shows lws allocations, but it is a complete example to show how to do it.
Unless your application is totally singlethreaded, when instrumenting a real allocator, care must be taken with
apis which deal with the hidden overallocation and listing allocations, that they are called from a locked critical section that disallows reentry, either an existing one that the allocator already uses, or add a new mutex.
Calling _lws_alloc_metadata_dump()
allows you to walk the current list of allocations from a heap and dump the backtrace responsible for its allocation. You can define your own iterator callback, or use a helper callback that is provided, lws_alloc_metadata_dump_stdout
, which issues the heap metadata in the lws convention base64 format described below.
To simplify triggering dumps, a convention is defined with a 3-character lead-in identifying lines as dumps or backtraces. This kind of approach makes it easy to emit the metadata into logs and fish them out with grep or similar.
lead-in | signifies | Example |
---|---|---|
~m# | Compressed allocator backtrace, eg, emitted into logs | ~m#IF0BmagugNDWgCnkhdAYpQa6wAAV |
~b# | Decoded, uncompressed backtrace line suitable for addr2line | ~b#size: 7520, 0x406651 0x406852 0x406c1b 0x406294 |
Both examples are complete representations of the same 4-level, 64-bit compressed backtrace.
The lws-api-test-backtrace
example (requires LWS_WITH_COMPRESSED_BACKTRACES
to build) decodes the base64 representations with or without the 3-character lead-in, to textual output suitable for addr2line
. Eg
You can use it with addr2line
in this kind of way (you probably want to give -f -p
to addr2line
as well)
There is a shell script ./contrib/heapmap.sh
which takes a screenscrape of a dump's ~m#
log lines and processes them into an allocation size, backtrace, and function names (especially in RELEASE mode, either a function name hint or the source coordinates are available).
The compressed blob has an outer structure designed for prepending, where the information available at recovery is a pointer to the end of it.
This goes behind the reported allocation, the actual allocation is increased to allow for it and we report what the caller asked for by pointing at the end of this. It means eg at free() time, we are told the address just past the end of this and work backwards to find the start of the compressed blob (which is further aligned backwards to ptr boundary to recover the true allocation start).
data | bits | meaning |
---|---|---|
compressed blob | variable, padded to byte boundary | Backtrace and extra info |
compressed length | fixed, 16 | MSB-first 16-bit byte count of compressed blob, including the 16-bit length itself |
lws_dll2_t | fixed, 3 x pointers | linked-list for tracking |
data | bits | meaning |
---|---|---|
stack depth | 5 | Number of backtrace callstack levels present |
Call stack items, one per stack depth | variable | Compressed Instruction Pointer value |
alloc size bits | 6 | Number of bits in alloc size |
alloc size literal | variable | Allocation size |
The goal is to compress 32- or 64-bit backtraces efficiently.
The Call stack items are compressed one of two ways and start with a bit indicating which method was used for this Call stack item.
The literals issue a bit count and then the significant bits
The delta from a previous Call stack item looks like this:
The delta is decoded, and added or subtracted from the earlier reference result to arrive at the correct reconstruction.
The first Call stack item is always a literal.
Backtrace generation in esp-idf requires CONFIG_COMPILER_CXX_EXCEPTIONS
set in sdkconfig.