@emeryberger
Contributor

DWARF 5 Rangelist Support for libelfin

This change adds support for DWARF 5 range lists (DW_FORM::rnglistx with .debug_rnglists section) to libelfin.

Problem

DWARF 5 binaries use a new format for range lists that differs from DWARF 4:

  • DWARF 4: Uses DW_FORM::sec_offset pointing directly into .debug_ranges
  • DWARF 5: Uses DW_FORM::rnglistx, an index into the offsets table of .debug_rnglists; the list entries themselves are encoded with DW_RLE_* opcodes

Because libelfin only understood the DWARF 4 form, attempting to parse a DWARF 5 range list threw a value_type_mismatch exception:

cannot read value::type::rangelist as sec_offset

Changes

dwarf/data.hh

Added the DW_RLE enum for DWARF 5 range list entry encodings (DWARF 5 specification, Section 7.25):

enum class DW_RLE : ubyte
{
    end_of_list    = 0x00,
    base_addressx  = 0x01,
    startx_endx    = 0x02,
    startx_length  = 0x03,
    offset_pair    = 0x04,
    base_address   = 0x05,
    start_end      = 0x06,
    start_length   = 0x07,
};

dwarf/dwarf++.hh

  • Added rnglists to section_type enum
  • Added is_dwarf5 parameter to rangelist constructor and rangelist::iterator

dwarf/elf.cc

Added section name mapping:

{".debug_rnglists", section_type::rnglists},

dwarf/value.cc

Updated as_rangelist() to handle DW_FORM::rnglistx:

  • Reads the ULEB128 index from the attribute
  • Parses the .debug_rnglists header (unit_length, version, addr_size, segment_selector_size, offset_entry_count)
  • Looks up the offset in the offsets table
  • Returns a rangelist configured for DWARF 5 parsing
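
The lookup steps above can be sketched as follows. This is a minimal standalone illustration, not the code in dwarf/value.cc: read_le and resolve_rnglistx are hypothetical helpers, the section is assumed to be a little-endian byte buffer whose range list table starts at offset 0, and each entry in the offsets table is taken to be relative to the start of that table, as the DWARF 5 specification describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: read a little-endian unsigned integer of `size` bytes
// at `pos`. Returns std::nullopt if the read would run past the buffer.
std::optional<std::uint64_t> read_le(const std::vector<std::uint8_t>& buf,
                                     std::size_t pos, std::size_t size) {
    if (size > 8 || pos > buf.size() || buf.size() - pos < size)
        return std::nullopt;
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value |= static_cast<std::uint64_t>(buf[pos + i]) << (8 * i);
    return value;
}

// Hypothetical helper: turn a DW_FORM_rnglistx index into a byte offset
// within the .debug_rnglists buffer `sec`.
// Assumption: the table header starts at offset 0 of `sec`.
std::optional<std::uint64_t> resolve_rnglistx(
        const std::vector<std::uint8_t>& sec, std::uint64_t index) {
    // unit_length: 4 bytes, or 0xffffffff followed by 8 bytes (64-bit DWARF).
    auto initial = read_le(sec, 0, 4);
    if (!initial) return std::nullopt;
    const bool dwarf64 = (*initial == 0xffffffffu);
    const std::size_t offset_size = dwarf64 ? 8 : 4;
    std::size_t pos = dwarf64 ? 12 : 4;

    // version: 2 bytes, must be 5 for .debug_rnglists.
    auto version = read_le(sec, pos, 2);
    if (!version || *version != 5) return std::nullopt;
    pos += 2;

    // address_size and segment_selector_size: 1 byte each (not needed here).
    pos += 2;

    // offset_entry_count: 4 bytes.
    auto count = read_le(sec, pos, 4);
    if (!count || index >= *count) return std::nullopt;
    pos += 4;

    // The offsets table follows the header; entries are relative to its start.
    const std::size_t table_start = pos;
    auto rel = read_le(
        sec, table_start + static_cast<std::size_t>(index) * offset_size,
        offset_size);
    if (!rel) return std::nullopt;
    return table_start + *rel;
}
```

In a binary with several compilation units, the starting point would presumably come from each unit's DW_AT_rnglists_base attribute rather than from offset 0; that refinement is left out of this sketch.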

dwarf/rangelist.cc

Updated rangelist::iterator::operator++() to support both formats:

  • DWARF 4: Pairs of addresses with special base address entries
  • DWARF 5: DW_RLE_* encoded entries including:
    • DW_RLE_end_of_list - terminates the list
    • DW_RLE_offset_pair - two ULEB128 offsets from base
    • DW_RLE_base_address - sets new base address
    • DW_RLE_start_end - two full addresses
    • DW_RLE_start_length - address + ULEB128 length

dwarf/to_string.cc

Added to_string(DW_RLE v) for debug output.
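
For illustration, a to_string overload for this enum might look roughly like the sketch below. The enum is redeclared here (with std::uint8_t standing in for libelfin's ubyte) so the example is self-contained, and the exact strings, including the fallback for unknown values, are assumptions; in libelfin the real function may be generated by enum-print.py and formatted differently.

```cpp
#include <cstdint>
#include <string>

// Redeclared for a standalone example; values match the DW_RLE enum above.
enum class DW_RLE : std::uint8_t {
    end_of_list = 0x00, base_addressx = 0x01, startx_endx = 0x02,
    startx_length = 0x03, offset_pair = 0x04, base_address = 0x05,
    start_end = 0x06, start_length = 0x07,
};

// Sketch: map each entry kind to a readable name for debug output.
std::string to_string(DW_RLE v) {
    switch (v) {
    case DW_RLE::end_of_list:   return "DW_RLE_end_of_list";
    case DW_RLE::base_addressx: return "DW_RLE_base_addressx";
    case DW_RLE::startx_endx:   return "DW_RLE_startx_endx";
    case DW_RLE::startx_length: return "DW_RLE_startx_length";
    case DW_RLE::offset_pair:   return "DW_RLE_offset_pair";
    case DW_RLE::base_address:  return "DW_RLE_base_address";
    case DW_RLE::start_end:     return "DW_RLE_start_end";
    case DW_RLE::start_length:  return "DW_RLE_start_length";
    }
    // Assumed fallback format for values outside the enum: two hex digits.
    static const char digits[] = "0123456789abcdef";
    const unsigned raw = static_cast<unsigned>(v);
    std::string hex;
    hex += digits[(raw >> 4) & 0xf];
    hex += digits[raw & 0xf];
    return "(DW_RLE)0x" + hex;
}
```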

Limitations

The following DW_RLE_* entries, which require .debug_addr lookups, are parsed but skipped:

  • DW_RLE_base_addressx
  • DW_RLE_startx_endx
  • DW_RLE_startx_length

Full support would require passing the compilation unit's .debug_addr base (DW_AT_addr_base) to the rangelist iterator.
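
As a rough idea of what the missing piece involves, the sketch below resolves an address index against a .debug_addr buffer. It is not part of this PR: lookup_addrx is a hypothetical helper, addresses are assumed to be little-endian, and addr_base is assumed to be the byte offset of the unit's first address entry (the value carried by DW_AT_addr_base), so entry i lives at addr_base + i * addr_size.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: fetch address number `index` from .debug_addr.
// Bounds are checked against the end of the section only, not against
// the unit's own table length, which a fuller version would also verify.
std::optional<std::uint64_t> lookup_addrx(
        const std::vector<std::uint8_t>& debug_addr,
        std::uint64_t addr_base, std::uint64_t index, unsigned addr_size) {
    if (addr_size == 0 || addr_size > 8 || addr_base > debug_addr.size())
        return std::nullopt;
    // Number of whole entries between addr_base and the end of the buffer.
    const std::uint64_t available =
        (debug_addr.size() - addr_base) / addr_size;
    if (index >= available) return std::nullopt;

    const std::size_t pos =
        static_cast<std::size_t>(addr_base + index * addr_size);
    std::uint64_t value = 0;
    for (unsigned i = 0; i < addr_size; ++i)
        value |= static_cast<std::uint64_t>(debug_addr[pos + i]) << (8 * i);
    return value;
}
```

With a helper along these lines, DW_RLE_base_addressx, DW_RLE_startx_endx, and DW_RLE_startx_length could map their index operands to real addresses instead of being skipped.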

Testing

Verified with a DWARF 5 binary compiled with g++ -gdwarf-5. The profiler now successfully parses range lists and generates valid profiles.

emeryberger and others added 14 commits March 8, 2024 15:55
Found by compiling with clang++ instead of g++.
Remove incorrect use of const
This code uses `front()` to get the underlying string buffer. However,
when libstdc++ assertions (`_GLIBCXX_ASSERTIONS`) are enabled, this causes
an assert failure if the string is empty. Since we have no need to perform
the memmove if the string is empty, we can fix the crash by simply guarding
with the condition `size > 0`.

Note that this assert failure only occurred with libstdc++ assertions
enabled. Because Arch apparently enables them by default (while other
distros don't), it initially appeared to be an Arch-specific problem.
However, it only surfaced on Arch; the same issue could occur on any
platform that enables these assertions.
Guard usage of `std::string::front` for empty string
These files are normally generated by enum-print.py during Makefile builds.
Including them in the repo enables CMake-based builds (like FetchContent)
to work without running the Python generator.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This reverts commit b88a0c6, reversing
changes made to ca22688.