DWARF debug info
kit's debug-info subsystem turns frontend type/variable/line information into
DWARF 5 inside an object file (the producer), and reads DWARF back out of an
object file to answer source-level queries (the consumer). Both halves live
under src/debug/, but they share no state types — the on-disk DWARF wire
format is the only contract between them. The producer is driven by the
code generator; the consumer is a standalone reader over an already-parsed
KitObjFile. This split is what lets kit -g emit DWARF that kit addr2line, kit objdump --dwarf, and the dbg debugger consume through the
same public API.
    frontend ──► CG API ──► Debug (producer) ─emit─► .debug_* sections
    (lang/c)     (session)  src/debug/debug*.c       in KitObjFile
                                                           │
                                                           ▼
                   kit_dwarf_* (consumer) ◄─open─ src/debug/dwarf_*.c
                   addr2line / objdump / dbg / emu
Only DWARF version 5 in the 32-bit (DWARF32) length form is supported. The
consumer tolerates and skips DWARF64 and pre-5 units rather than decoding them.
The CFI/.eh_frame half of unwinding is produced elsewhere — by the
MCEmitter, not by Debug — but consumed here; see §3.
See OBJ.md for the section/symbol/relocation substrate, CODEGEN.md for the CG API that drives the producer, LINK.md for how debug sections survive linking and JIT view-merging, and DBG.md/EMU.md for the debugger and emulator that consume the reader.
1. The SourceManager: the file-id authority
src/core/source.c owns the mapping from a small integer file_id to a source
file's name/path/kind. It is the single authority shared across diagnostics,
dependency (-M) output, and DWARF. A SrcLoc is (file_id, line, col);
file_id == 0 is reserved as the null/invalid slot (so source_new seeds slot
0 empty and real files start at 1), and source_file() returns NULL for it.
Files enter via source_add_file (a real on-disk path), source_add_memory
(an in-memory unit, used heavily by tests so paths are stable across runs), or
source_add_builtin. The manager also records #include edges
(source_add_include) which feed dependency generation, and macro-expansion
pseudo-files. The DWARF producer never invents its own file numbering: it asks
the SourceManager for the path behind a SrcLoc.file_id and assigns its own
dense DWARF file index on top (see §2.4). This keeps every file:line the
compiler reports — in an error message, in a .d file, and in .debug_line —
referring to the same underlying file identity.
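
The two-level numbering described above can be sketched in a few lines. This is an illustrative miniature, not kit's code: MiniSources and every mini_* function are hypothetical names, and handing out dense indices in first-use order starting at 0 is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the scheme; these are not kit's real types. */
enum { MINI_MAX_FILES = 16 };

typedef struct {
    const char *path[MINI_MAX_FILES];      /* slot 0 stays NULL: the invalid id */
    uint32_t    count;                     /* next free source file_id */
    uint32_t    dwarf_idx[MINI_MAX_FILES]; /* file_id -> dense index + 1; 0 = unassigned */
    uint32_t    dwarf_count;               /* dense indices handed out so far */
} MiniSources;

static void mini_sources_init(MiniSources *sm) {
    for (size_t i = 0; i < MINI_MAX_FILES; i++) {
        sm->path[i] = NULL;
        sm->dwarf_idx[i] = 0;
    }
    sm->count = 1; /* real files start at 1 */
    sm->dwarf_count = 0;
}

/* Register a file and return its source-level id (never 0). */
static uint32_t mini_sources_add(MiniSources *sm, const char *path) {
    assert(sm->count < MINI_MAX_FILES);
    sm->path[sm->count] = path;
    return sm->count++;
}

/* Returns NULL for the reserved id 0 and for ids never handed out. */
static const char *mini_sources_path(const MiniSources *sm, uint32_t file_id) {
    if (file_id == 0 || file_id >= sm->count) return NULL;
    return sm->path[file_id];
}

/* Producer side: assign a dense index the first time a file is referenced
 * (assumed policy for this sketch), and reuse it afterwards. */
static uint32_t mini_dwarf_file_index(MiniSources *sm, uint32_t file_id) {
    assert(mini_sources_path(sm, file_id) != NULL);
    if (sm->dwarf_idx[file_id] == 0) sm->dwarf_idx[file_id] = ++sm->dwarf_count;
    return sm->dwarf_idx[file_id] - 1;
}
```

The point of the split is that the dense index depends only on which files the debug info actually references, while the source-level id stays stable for diagnostics and dependency output.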
2. Producer architecture
The producer is the Debug object. Its public surface is src/debug/debug.h;
its state and the wire-format serializer are private to src/debug/.
2.1 Who creates and drives Debug
Other layers call into Debug; Debug never calls back out. src/debug/ includes
core and obj but not src/cg/ or src/arch/; the reverse direction (CG using
Debug) is fine. The flow:
- The backend's make() creates the producer when opts->debug_info is set, via cg_mc_debug_new (src/arch/cgtarget.c), gated by KIT_DWARF_ENABLED. The same Debug* is handed to the MCEmitter (for line rows) and captured by the CG session before any optimizer wrapper.
- The CG session (src/cg/session.c) drives function/scope/variable lifecycle from inside the public CG API entry points, and calls debug_emit then debug_free at kit_cg_finish.
- The C-type → DWARF-type adapter lives at src/cg/debug.c (api_debug_type), not in the language frontend: it lowers a CG type id (KitCgTypeId) into a chain of debug_type_* calls. Debug itself is language-neutral: it knows DWARF type kinds (base, ptr, array, qualified, typedef, func, record, enum) but nothing about C.
Events split by who owns the information:
| Event | Driver | Producer call |
|---|---|---|
| function begin/end, return type | CG session | debug_func_begin/debug_func_end |
| params, locals, their storage | CG session at func_end | debug_param/debug_local |
| current source location | CG session on set_loc | debug_set_pending_loc |
| line rows (offset ↔ loc) | backend per instruction | debug_emit_row |
| function PC bounds | backend at finalize | debug_func_pc_range |
| types | api_debug_type adapter | debug_type_* |
The two-sided event is the line program: only the parser/CG side knows the
SrcLoc, and only the backend knows the byte offset of an emitted instruction.
So the CG session stashes the latest loc with debug_set_pending_loc, and each
backend instruction emitter calls debug_emit_row(debug, section, offset, loc)
after writing bytes (see the dense if (mc->debug) debug_emit_row(...) calls
in src/arch/*/emit.c). debug_line dedupes a row whose (section, offset, loc) equals the previous one, so a multi-instruction CG op that re-reports the
same loc costs nothing. Rows are accumulated per-function in emit order, so no
sort pass is needed.
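
A minimal sketch of the row dedupe, assuming a fixed-capacity per-function list. MiniSrcLoc, MiniLineRow, MiniRowList and mini_emit_row are hypothetical stand-ins; the real debug_emit_row signature and storage are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t file_id, line, col; } MiniSrcLoc;
typedef struct { uint32_t section, offset; MiniSrcLoc loc; } MiniLineRow;

enum { MINI_MAX_ROWS = 64 };
typedef struct { MiniLineRow rows[MINI_MAX_ROWS]; size_t n; } MiniRowList;

static bool mini_row_same(const MiniLineRow *a, const MiniLineRow *b) {
    return a->section == b->section && a->offset == b->offset &&
           a->loc.file_id == b->loc.file_id && a->loc.line == b->loc.line &&
           a->loc.col == b->loc.col;
}

/* Append a row in emit order unless it repeats the previous row exactly.
 * Returns true when the row was stored. */
static bool mini_emit_row(MiniRowList *list, uint32_t section, uint32_t offset,
                          MiniSrcLoc loc) {
    MiniLineRow row = { section, offset, loc };
    if (list->n > 0 && mini_row_same(&list->rows[list->n - 1], &row)) return false;
    assert(list->n < MINI_MAX_ROWS);
    list->rows[list->n++] = row;
    return true;
}
```

Because only the immediately preceding row is compared, the check is constant-time per instruction and the list stays in emit order.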
debug_func_pc_range records (text_section, begin_ofs, end_ofs) against the
currently-open function; function size is end - begin. debug_prune_removed_funcs
drops functions whose symbol the object layer later marked removed (e.g. an
inlined-away or dead function), so stale DIEs and line rows don't ship.
2.2 Producer state shape
Debug (src/debug/debug_internal.h) holds:
- a file table: dense DebugFile[] plus a src_file_id → dwarf_idx map; each entry caches the path split into interned dir/base symbols;
- a type DIE pool: DebugType[] indexed by DebugTypeId (1-based; DEBUG_TYPE_NONE == 0). Construction is one-shot per call: building the same shape twice yields two ids. void is the one interned singleton. Records and enums are built incrementally through opaque builder handles (debug_type_record_begin/_field/_end) so recursive shapes resolve via the adapter's own cache;
- a function table: DebugFunc[], each carrying its symbol, function-type id, decl loc, PC range, a flat DebugVarDIE[] of params+locals, a DebugScope[] tree built from scope_begin/scope_end pairs (with an open-scope stack while building), and its function-local LineRow[];
- a loclist table: storage for time-varying variable locations (registered via debug_loclist_new/_add), wired but not yet serialized.
Variable location is a tagged DebugVarLoc: DVL_FRAME (frame offset),
DVL_REG (DWARF register number), DVL_GLOBAL (symbol), or DVL_LOCLIST.
2.3 Serialization: debug_emit
debug_emit (src/debug/debug_emit.c) linearizes everything into .debug_*
sections in one pass over an EmitCtx. The helper layers are
src/debug/debug_abbrev.c (abbreviation pool, dedup by
(tag, has_children, attr-list), 1-based codes assigned in first-use order) and
src/debug/debug_form.c (LEB128 and fixed-width form byte encoders that write
into a Buf, independent of any live ObjBuilder section).
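
For reference, the LEB128 encodings that such form encoders rely on look roughly like this. The function names are hypothetical and the sketch writes into a caller-supplied byte array rather than kit's Buf; a 64-bit value needs at most 10 output bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last. */
static size_t mini_uleb128(uint8_t *out, uint64_t v) {
    size_t n = 0;
    do {
        uint8_t byte = (uint8_t)(v & 0x7f);
        v >>= 7;
        if (v != 0) byte |= 0x80;
        out[n++] = byte;
    } while (v != 0);
    return n;
}

/* Signed LEB128: stop once the remaining value is pure sign extension of
 * bit 6 of the byte just produced. */
static size_t mini_sleb128(uint8_t *out, int64_t v) {
    size_t n = 0;
    bool more = true;
    while (more) {
        uint8_t byte = (uint8_t)(v & 0x7f);
        /* Floor-divide by 128 without right-shifting a negative value. */
        v = (v < 0) ? ~(~v >> 7) : (v >> 7);
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) more = false;
        else byte |= 0x80;
        out[n++] = byte;
    }
    return n;
}
```

The two test vectors commonly quoted for this encoding are 624485, which encodes as E5 8E 26, and -123456, which encodes as C0 BB 78.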
Sections emitted: .debug_abbrev, .debug_info, .debug_line,
.debug_line_str, .debug_str, .debug_str_offsets, .debug_aranges,
.debug_rnglists. All eight section ids — plus a paired SK_SECTION symbol per
section — are created up front, before any payload, because cross-section
references are emitted as relocations that must name their target symbol.
Key wire-format choices, all centralized in resolve_abbrevs and the emit
helpers:
- Strings are interned into .debug_str, referenced from .debug_info uniformly as DW_FORM_strx4 indices through the .debug_str_offsets table (the CU root carries DW_AT_str_offsets_base). Line-program file/dir paths live in the separate .debug_line_str and are referenced by DW_FORM_line_strp.
- Intra-CU DIE references (e.g. DW_AT_type) are DW_FORM_ref4, unit-relative. They are forward-resolved: the type's body offset is unknown at reference time, so a fixup list is recorded and patched after all DIEs are laid out (cu_relative = cu_header_size + target_body_offset). A reference to void (which has no DIE) is written as 0; the consumer reads 0 as void.
- Addresses use relocations, never literal values. DW_AT_low_pc (DW_FORM_addr) and the line program's DW_LNE_set_address emit an R_ABS64 against the function symbol. DW_AT_high_pc is DW_FORM_data4 holding the function size (offset form), so it needs no reloc.
- Address size is a single value, the target's pointer width (c->target.ptr_size), threaded everywhere an address-sized field appears: the CU header address_size byte, the line-program header, DW_FORM_addr slots, the DW_OP_addr operand width, and the .debug_aranges/.debug_rnglists payload widths (aranges also aligns each tuple to 2 * address_size from the section start). The producer never hardcodes 8; cross-targeting a 32-bit pointer target would narrow these fields uniformly.
- Cross-section offsets (DW_AT_stmt_list, DW_AT_ranges, DW_AT_str_offsets_base, debug_abbrev_offset, .debug_str_offsets entries, .debug_line line_strp slots, the aranges debug_info_offset) are written with the correct literal value and paired with an R_ABS32 reloc against the target section's symbol. In a plain .o the debug sections are not laid out (section vaddr 0), so the reloc is a no-op and the literal stands; under the JIT view-builder (src/link/link_jit.c), where several inputs' debug sections are concatenated, the reloc rebases each offset to its slot in the merged view. This dual-write is the single trick that makes multi-input DWARF (one CU per input) resolve correctly in-process.
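
The forward-reference patching for DW_FORM_ref4 can be illustrated as below. This is a sketch under stated assumptions: MiniRefFixup and the mini_* helpers are hypothetical, the DIE body is held in a flat byte array, and the target is little-endian. The 12-byte header size is the DWARF 5, DWARF32 compile-unit header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DWARF 5, DWARF32 compile-unit header: unit_length(4) + version(2) +
 * unit_type(1) + address_size(1) + debug_abbrev_offset(4). */
enum { MINI_CU_HEADER_SIZE = 12 };

typedef struct {
    uint32_t patch_at;    /* offset of the 4-byte ref4 slot within the DIE body */
    uint32_t target_type; /* hypothetical 1-based type id; 0 means void */
} MiniRefFixup;

static void mini_put_u32le(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static uint32_t mini_get_u32le(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Patch every recorded slot once all type DIE body offsets are known.
 * type_body_ofs[t] is the body offset of type t's DIE; void has none,
 * so a reference to it is written as 0. */
static void mini_apply_ref4_fixups(uint8_t *body, size_t body_len,
                                   const MiniRefFixup *fix, size_t nfix,
                                   const uint32_t *type_body_ofs) {
    for (size_t i = 0; i < nfix; i++) {
        uint32_t ref = 0;
        assert((size_t)fix[i].patch_at + 4 <= body_len);
        if (fix[i].target_type != 0)
            ref = MINI_CU_HEADER_SIZE + type_body_ofs[fix[i].target_type];
        mini_put_u32le(body + fix[i].patch_at, ref);
    }
}
```

Adding the header size is what turns a body offset into a unit-relative one, since ref4 values are measured from the first byte of the unit header.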
The CU root DIE carries DW_AT_producer, DW_AT_language (DW_LANG_C11),
DW_AT_name/DW_AT_comp_dir (the primary file's base/dir, seeded from the
first function's decl site if no file has been referenced yet), DW_AT_stmt_list,
DW_AT_ranges, and DW_AT_str_offsets_base. The subprogram DIE uses a single
abbrev with DW_AT_type always present (a void return writes ref4=0). When a
function has no source-level params (e.g. unoptimized prototype-only info), the
emitter falls back to synthesizing DW_TAG_formal_parameter children from the
function type's parameter types so the signature is still recoverable.
DW_AT_frame_base on every subprogram is the one-byte exprloc
{ DW_OP_call_frame_cfa }. Variable locations become exprlocs per
DebugVarLoc: DW_OP_regN/DW_OP_regx for registers, DW_OP_fbreg <sleb> for
frame offsets, DW_OP_addr for globals.
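
As an illustration of that mapping, the sketch below encodes the three directly serialized location kinds into expression bytes. The opcode values are the standard DWARF ones; MiniVarLoc and mini_encode_loc are hypothetical, and the DW_OP_addr operand is left as zeros here on the assumption that a relocation supplies the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard DWARF expression opcodes used below. */
enum { OP_ADDR = 0x03, OP_REG0 = 0x50, OP_REGX = 0x90, OP_FBREG = 0x91 };

typedef enum { MINI_DVL_FRAME, MINI_DVL_REG, MINI_DVL_GLOBAL } MiniVarLocKind;
typedef struct { MiniVarLocKind kind; int64_t frame_ofs; uint32_t reg; } MiniVarLoc;

static size_t loc_uleb(uint8_t *out, uint64_t v) {
    size_t n = 0;
    do {
        uint8_t byte = (uint8_t)(v & 0x7f);
        v >>= 7;
        if (v != 0) byte |= 0x80;
        out[n++] = byte;
    } while (v != 0);
    return n;
}

static size_t loc_sleb(uint8_t *out, int64_t v) {
    size_t n = 0;
    for (;;) {
        uint8_t byte = (uint8_t)(v & 0x7f);
        v = (v < 0) ? ~(~v >> 7) : (v >> 7); /* floor-divide by 128 */
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) {
            out[n++] = byte;
            return n;
        }
        out[n++] = (uint8_t)(byte | 0x80);
    }
}

/* Writes the expression into out (at least 11 bytes) and returns its length. */
static size_t mini_encode_loc(uint8_t *out, const MiniVarLoc *loc, unsigned addr_size) {
    size_t n = 0;
    switch (loc->kind) {
    case MINI_DVL_REG:
        if (loc->reg < 32) {
            out[n++] = (uint8_t)(OP_REG0 + loc->reg); /* DW_OP_reg0..reg31 */
        } else {
            out[n++] = OP_REGX;                       /* DW_OP_regx <uleb> */
            n += loc_uleb(out + n, loc->reg);
        }
        break;
    case MINI_DVL_FRAME:
        out[n++] = OP_FBREG;                          /* DW_OP_fbreg <sleb> */
        n += loc_sleb(out + n, loc->frame_ofs);
        break;
    case MINI_DVL_GLOBAL:
        assert(addr_size == 4 || addr_size == 8);
        out[n++] = OP_ADDR;                           /* DW_OP_addr <address> */
        memset(out + n, 0, addr_size);                /* relocation fills this */
        n += addr_size;
        break;
    }
    return n;
}
```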
2.4 Line program
The line program (emit_section_line) is built program-first (so its byte
length is known before the header). DWARF 5 conventions: file 0 is the CU
primary file; directory_entry_format/file_name_entry_format are fixed
(DW_LNCT_path as line_strp, plus DW_LNCT_directory_index for files);
directories are deduped. minimum_instruction_length and
maximum_operations_per_instruction come from the arch's ArchDwarfOps
(src/arch/arch.h) — fixed-width ISAs use their instruction width, x86-64 uses
1 because PC advances are byte-granular.
Per function with a PC range: emit DW_LNE_set_address (relocated against the
function symbol), then for each row advance file/column/PC/line with standard
opcodes (DW_LNS_set_file, set_column, advance_pc/fixed_advance_pc,
advance_line, copy), then advance to the function end and emit
DW_LNE_end_sequence. No special opcodes are produced, and the only extended
opcodes are the DW_LNE_set_address and DW_LNE_end_sequence pair that brackets
each function, so the encoding stays simple and re-decodable.
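
The per-function sequence can be sketched as follows. This is an assumption-laden miniature rather than emit_section_line itself: MiniSeqRow and mini_emit_sequence are hypothetical, minimum_instruction_length is taken to be 1 so DW_LNS_advance_pc operands are plain byte deltas, only advance_pc (not fixed_advance_pc) is used, and the set_address operand is left zero for a relocation to fill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard and extended line-number opcodes (DWARF values). */
enum { LNS_COPY = 1, LNS_ADVANCE_PC = 2, LNS_ADVANCE_LINE = 3,
       LNS_SET_FILE = 4, LNS_SET_COLUMN = 5,
       LNE_END_SEQUENCE = 1, LNE_SET_ADDRESS = 2 };

typedef struct { uint32_t pc_ofs, file, line, col; } MiniSeqRow;

static size_t ln_uleb(uint8_t *out, uint64_t v) {
    size_t n = 0;
    do {
        uint8_t byte = (uint8_t)(v & 0x7f);
        v >>= 7;
        if (v != 0) byte |= 0x80;
        out[n++] = byte;
    } while (v != 0);
    return n;
}

static size_t ln_sleb(uint8_t *out, int64_t v) {
    size_t n = 0;
    for (;;) {
        uint8_t byte = (uint8_t)(v & 0x7f);
        v = (v < 0) ? ~(~v >> 7) : (v >> 7);
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) {
            out[n++] = byte;
            return n;
        }
        out[n++] = (uint8_t)(byte | 0x80);
    }
}

/* Emit one function's sequence; rows must be in increasing pc_ofs order.
 * The caller provides a buffer large enough for the encoded rows. */
static size_t mini_emit_sequence(uint8_t *out, const MiniSeqRow *rows, size_t nrows,
                                 uint32_t func_size, unsigned addr_size) {
    size_t p = 0;
    uint32_t pc = 0, file = 1, line = 1, col = 0; /* state-machine defaults */
    assert(addr_size == 4 || addr_size == 8);
    out[p++] = 0;                                 /* extended-opcode escape */
    p += ln_uleb(out + p, 1u + addr_size);        /* length: opcode + address */
    out[p++] = LNE_SET_ADDRESS;
    memset(out + p, 0, addr_size);                /* relocated against the symbol */
    p += addr_size;
    for (size_t i = 0; i < nrows; i++) {
        const MiniSeqRow *r = &rows[i];
        assert(r->pc_ofs >= pc);
        if (r->file != file) { out[p++] = LNS_SET_FILE; p += ln_uleb(out + p, r->file); file = r->file; }
        if (r->col != col) { out[p++] = LNS_SET_COLUMN; p += ln_uleb(out + p, r->col); col = r->col; }
        if (r->pc_ofs != pc) { out[p++] = LNS_ADVANCE_PC; p += ln_uleb(out + p, r->pc_ofs - pc); pc = r->pc_ofs; }
        if (r->line != line) {
            out[p++] = LNS_ADVANCE_LINE;
            p += ln_sleb(out + p, (int64_t)r->line - (int64_t)line);
            line = r->line;
        }
        out[p++] = LNS_COPY;                      /* append a row to the matrix */
    }
    assert(func_size >= pc);
    if (func_size != pc) { out[p++] = LNS_ADVANCE_PC; p += ln_uleb(out + p, func_size - pc); }
    out[p++] = 0; out[p++] = 1; out[p++] = LNE_END_SEQUENCE;
    return p;
}
```

Two rows at offsets 0 and 4 in an 8-byte function with 8-byte addresses come out at 24 bytes this way; special opcodes would be denser, at the cost of a less direct encoder.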
.debug_aranges (a (low_pc, length) per function, kept for fast attach) and
.debug_rnglists (one DW_RLE_start_length per function) round out the
address-coverage indexes.
3. CFI / .eh_frame — produced outside Debug
Unwind info is not emitted by the Debug producer. The .eh_frame section is
synthesized by the MCEmitter (mc_emit_eh_frame in src/arch/mc.c), driven by
per-arch CFI directives (cfi_startproc/cfi_def_cfa/cfi_offset/cfi_endproc)
that the native backends call around their prologues (e.g.
src/arch/aa64/native.c). The MCEmitter buffers a per-function FDE of CFI
directives, each tagged with a post-prologue PC offset, and assembles one CIE +
one FDE-per-function at TU finalize. This lives in the codegen path because only
the backend knows the exact prologue shape and PC offsets.
The DWARF consumer (dwarf_cfi.c) reads this .eh_frame for unwinding, so the
two ends still meet at the wire format — just on the codegen side, not the Debug
side. See CODEGEN.md/ARCH.md for the producer.
4. Consumer architecture
The consumer is KitDebugInfo, opened from a KitObjFile and answering the
kit_dwarf_* queries declared in include/kit/dwarf.h. It is split by
concern into one file per stage, sharing the private dwarf_internal.h. The
reader never re-decodes the object format: it asks the obj layer for section
bytes by name and treats them as its substrate. Most state is built lazily on
the first query that needs it.
    open/abbrev   dwarf_open.c    sections, byte primitives, abbrev cache,
                                  CU headers, form decoding, DIE iteration
         │
    DIE walk      dwarf_die.c     subprograms, lexical blocks, params/locals,
                                  globals; attribute packing
         │
    line prog     dwarf_line.c    decode .debug_line → row matrix; addr↔line
         │
    CFI           dwarf_cfi.c     .eh_frame machine; kit_dwarf_unwind_step
         │
    loc/type      dwarf_loc.c     DWARF stack machine; loclist resolution
                  dwarf_type.c    type DIE → KitDwarfType (cached)
         │
    query         dwarf_query.c   subprogram_at/named, var_at, vars/param iters,
                                  loc_read
    dump          dwarf_dump.c    structural iterators for objdump --dwarf
4.1 Open, sections, primitives (dwarf_open.c)
kit_dwarf_open looks up debug sections by name. It is format-aware in only
one place: dw_find_section also tries the Mach-O spelling
(__DWARF,__debug_*, 16-char truncated, and __TEXT,__eh_frame) so one lookup
spans ELF and Mach-O. The mandatory five are .debug_abbrev, .debug_info,
.debug_line, .debug_str, .debug_line_str; if any is missing, open fails
with KIT_NOT_FOUND. .debug_str_offsets, .debug_addr, .debug_loclists,
.debug_rnglists, .debug_aranges, and .eh_frame are optional.
This file also holds the bounds-checked byte-stream primitives (dw_u8/u16/
u24/u32/u64/uleb/sleb/cstr), the abbrev-table parser and cache
(keyed by abbrev-section offset, shared across CUs that point at the same
table), the CU-header parser (which records each CU's address_size), the form
decoder (dw_read_form, which resolves strx/strp/line_strp to strings
inline and sizes DW_FORM_addr by the CU's address_size rather than assuming
8), and the generic DIE reader/skipper.
On truncated input the primitives clamp and return zero rather than crash. All
CUs are parsed eagerly into d->cus (dw_parse_all_cus, idempotent); each CU's
root DIE is scanned for the base attributes (str_offsets_base, addr_base,
stmt_list, name, comp_dir) in two passes so that strx resolution has its
base before any string attribute is read.
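
A minimal sketch of clamp-and-return-zero primitives, assuming little-endian input. MiniReader and the rd_* names are hypothetical, and the overrun flag is an addition for the example; the text above only states that the real primitives clamp and return zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *p;   /* read cursor */
    const uint8_t *end; /* one past the last valid byte */
    bool overrun;       /* set once any read ran past the end */
} MiniReader;

static uint8_t rd_u8(MiniReader *r) {
    if (r->p >= r->end) { r->overrun = true; return 0; }
    return *r->p++;
}

static uint32_t rd_u32(MiniReader *r) {
    uint32_t v;
    if ((size_t)(r->end - r->p) < 4) { /* clamp the cursor and yield zero */
        r->p = r->end;
        r->overrun = true;
        return 0;
    }
    v = (uint32_t)r->p[0] | (uint32_t)r->p[1] << 8 |
        (uint32_t)r->p[2] << 16 | (uint32_t)r->p[3] << 24;
    r->p += 4;
    return v;
}

static uint64_t rd_uleb(MiniReader *r) {
    uint64_t v = 0;
    unsigned shift = 0;
    for (;;) {
        uint8_t b;
        if (r->p >= r->end) { r->overrun = true; return 0; }
        b = *r->p++;
        if (shift < 64) v |= (uint64_t)(b & 0x7f) << shift; /* drop excess bits */
        shift += 7;
        if (!(b & 0x80)) return v;
    }
}
```

With this shape a decoder can read a whole header unconditionally and check for truncation once at the end, instead of testing every field.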
4.2 DIE walk (dwarf_die.c)
A recursive walker keyed off the abbrev table, run lazily and cached. It does
not expose a general "iterate every DIE" surface to queries (that's
dwarf_dump.c); instead it collects exactly what the query layer needs:
- dw_build_subs: every DW_TAG_subprogram/DW_TAG_inlined_subroutine, indexed by PC range. decl_file is resolved through the CU's line-program file table.
- dw_build_locals: for one subprogram, walks its subtree for DW_TAG_formal_parameter/DW_TAG_variable, threading lexical-block PC ranges down so each local carries the [scope_lo, scope_hi) it is live in.
- dw_build_globals: top-level DW_TAG_variable DIEs under each CU root.
Attribute reads funnel through read_pack/DieAttrPack, a flat struct that
captures the attributes any consumer cares about (name, low/high pc, type
offset, decl file/line, location block or loclist index, frame base, member
offset, byte/bit size, encoding, array count). DW_AT_type is normalized to an
absolute .debug_info offset (ref* forms are CU-relative; ref_addr is
absolute).
4.3 Line program decoder (dwarf_line.c)
dw_build_line runs the DWARF 5 line-number state machine for one CU's
stmt_list, materializing a DwLineRow[] row matrix. It parses the v5
directory/file entry formats, then composes a normalized absolute path per file
index (file_norm: dir + '/' + path, or the path as-is if already absolute) for
byte-equal matching. DWARF64 and non-5 versions are skipped.
kit_dwarf_addr_to_line finds the row covering a PC. The subtlety, encoded in
the loop, is sequence boundaries: a row covers [row.addr, next_row.addr), and
an end_sequence row closes a sequence rather than covering anything. Without
honoring that, in a multi-CU image (one CU per linked input, abutting in a
single .text) an earlier CU would swallow addresses belonging to a later one.
kit_dwarf_line_to_addr does the reverse, matching the user's file either
exactly or as a /-anchored suffix (so util.c:42 resolves against an absolute
file_norm); it returns KIT_AMBIGUOUS when distinct file paths match, with
kit_dwarf_line_to_addr_all to enumerate candidates.
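
The sequence-boundary rule for address lookup can be shown with a linear scan over a toy row matrix. MiniAddrRow and mini_addr_to_row are hypothetical; the real lookup may well use a different search strategy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t addr; uint32_t line; bool end_sequence; } MiniAddrRow;

/* Returns the index of the row covering pc, or -1 if none does.
 * A row covers [row.addr, next_row.addr); an end_sequence row only
 * terminates the range of the row before it and covers nothing. */
static long mini_addr_to_row(const MiniAddrRow *rows, size_t n, uint64_t pc) {
    for (size_t i = 0; i + 1 < n; i++) {
        if (rows[i].end_sequence) continue; /* never start a range here */
        if (pc >= rows[i].addr && pc < rows[i + 1].addr) return (long)i;
    }
    return -1;
}
```

Dropping the end_sequence check would let the terminator of one sequence claim every address up to the first row of the next, which is the cross-CU swallowing described above.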
4.4 Location evaluator and loclists (dwarf_loc.c)
dw_eval_expr is a small DWARF stack machine over a fixed 64-slot stack. It
supports the ops the producer emits plus enough arithmetic for composite forms:
DW_OP_litN/regN/bregN, addr, the constNu/s family, dup/drop,
and/or/xor/plus/minus/mul/shl/shr/shra/plus_uconst, regx,
bregx, fbreg, call_frame_cfa, and stack_value. DW_OP_fbreg recursively
evaluates the subprogram's DW_AT_frame_base (which is DW_OP_call_frame_cfa,
so the caller's frame->cfa supplies the base). The result is tagged as a
memory address, a register number, or an immediate ("stack value").
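
A reduced version of such a stack machine is sketched below, covering only a handful of the listed ops. All names are hypothetical. Since the frame base is DW_OP_call_frame_cfa, the sketch resolves DW_OP_fbreg directly against a cfa argument instead of recursing into a frame-base expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { MINI_STACK_SLOTS = 64 };

typedef enum { MINI_LOC_ERROR, MINI_LOC_MEMORY, MINI_LOC_REGISTER, MINI_LOC_VALUE } MiniLocClass;
typedef struct { MiniLocClass cls; uint64_t value; } MiniEvalResult;

/* Bounds-checked LEB128 read; sign-extends when is_signed is set. */
static bool ev_leb(const uint8_t *e, size_t len, size_t *i, bool is_signed, uint64_t *out) {
    uint64_t v = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        if (*i >= len) return false;
        b = e[(*i)++];
        if (shift < 64) v |= (uint64_t)(b & 0x7f) << shift;
        shift += 7;
    } while (b & 0x80);
    if (is_signed && shift < 64 && (b & 0x40)) v |= ~UINT64_C(0) << shift;
    *out = v;
    return true;
}

static MiniEvalResult mini_eval(const uint8_t *e, size_t len, uint64_t cfa) {
    const MiniEvalResult err = { MINI_LOC_ERROR, 0 };
    MiniEvalResult res = { MINI_LOC_MEMORY, 0 };
    uint64_t st[MINI_STACK_SLOTS], operand;
    size_t sp = 0, i = 0;
    while (i < len) {
        uint8_t op = e[i++];
        if (op >= 0x50 && op <= 0x6f) {               /* DW_OP_reg0..reg31 */
            res.cls = MINI_LOC_REGISTER;
            res.value = (uint64_t)(op - 0x50);
            return res;
        }
        if (op == 0x9f) { res.cls = MINI_LOC_VALUE; continue; }      /* stack_value */
        if (op == 0x13) { if (sp == 0) return err; sp--; continue; } /* drop */
        if (op == 0x22 || op == 0x1c) {               /* plus / minus */
            if (sp < 2) return err;
            st[sp - 2] = (op == 0x22) ? st[sp - 2] + st[sp - 1] : st[sp - 2] - st[sp - 1];
            sp--;
            continue;
        }
        if (op == 0x23) {                             /* plus_uconst <uleb> */
            if (sp == 0 || !ev_leb(e, len, &i, false, &operand)) return err;
            st[sp - 1] += operand;
            continue;
        }
        /* Every remaining supported op pushes exactly one value. */
        if (op >= 0x30 && op <= 0x4f) operand = (uint64_t)(op - 0x30); /* litN */
        else if (op == 0x12) { if (sp == 0) return err; operand = st[sp - 1]; } /* dup */
        else if (op == 0x9c) operand = cfa;           /* call_frame_cfa */
        else if (op == 0x91) {                        /* fbreg <sleb> */
            if (!ev_leb(e, len, &i, true, &operand)) return err;
            operand += cfa;                           /* wraps for negative offsets */
        } else return err;                            /* unsupported opcode */
        if (sp == MINI_STACK_SLOTS) return err;
        st[sp++] = operand;
    }
    if (sp == 0) return err;
    res.value = st[sp - 1];
    return res;
}
```

With cfa = 0x1000, the two-byte expression 91 70 (fbreg -16) evaluates to memory address 0xff0, which is the typical shape of a frame-resident local.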
dw_loclist_resolve walks a .debug_loclists entry for a DW_FORM_loclistx
index and returns the location expression active at a PC. It handles
offset_pair, start_end, start_length, default_location, and
base_address; the .debug_addr-indirected variants are recognized and skipped.
4.5 Type resolution (dwarf_type.c)
dw_type_from_die builds a KitDwarfType on demand from a DIE offset, cached
by offset. It interns the node before recursing into inner/field/element
types, which breaks cycles (a struct containing a pointer to itself). Qualifier
types (const/volatile/restrict) are modeled as transparent wrappers; the
public kit_dwarf_type_info and the field/enum iterators look through typedef
and qualifier layers to the underlying aggregate. Base-type encoding is mapped
to a small public kind enum (bool/sint/uint/float/char).
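
The intern-before-recurse trick is easiest to see on a toy model. In the sketch below a DIE is reduced to a kind plus one inner reference, and the cache is keyed by array index rather than by .debug_info offset; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MINI_MAX_TYPES = 8 };
enum { MINI_KIND_STRUCT = 1, MINI_KIND_POINTER = 2 };

typedef struct { int kind; int inner; } MiniTypeDie; /* inner: DIE index, -1 = none */
typedef struct MiniType { int kind; struct MiniType *inner; } MiniType;

typedef struct {
    const MiniTypeDie *dies;
    size_t ndies;
    MiniType nodes[MINI_MAX_TYPES];
    bool interned[MINI_MAX_TYPES];
} MiniTypeCache;

static void mini_type_cache_init(MiniTypeCache *c, const MiniTypeDie *dies, size_t ndies) {
    assert(ndies <= MINI_MAX_TYPES);
    memset(c, 0, sizeof *c);
    c->dies = dies;
    c->ndies = ndies;
}

static MiniType *mini_type_from_die(MiniTypeCache *c, int idx) {
    MiniType *t;
    if (idx < 0 || (size_t)idx >= c->ndies) return NULL;
    t = &c->nodes[idx];
    if (c->interned[idx]) return t; /* cache hit, including in-progress nodes */
    c->interned[idx] = true;        /* intern BEFORE recursing to break cycles */
    t->kind = c->dies[idx].kind;
    t->inner = mini_type_from_die(c, c->dies[idx].inner);
    return t;
}
```

If the node were interned only after its inner type had been built, a struct whose field points back at the struct would recurse without bound.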
4.6 CFI unwinder (dwarf_cfi.c)
kit_dwarf_unwind_step sweeps .eh_frame, finds the FDE whose
(initial_location, range) covers frame->pc, runs the CIE initial
instructions then the FDE program up to pc, and computes the caller frame. It
handles the common CFA opcodes (def_cfa/def_cfa_register/def_cfa_offset
and their _sf forms, advance_loc*, offset*, register, undefined,
same_value) and the zR-augmentation FDE pointer encodings. It mutates
frame->cfa and frame->pc; the return address comes from a register rule or
is treated as stack-bottom (KIT_NOT_FOUND) when undefined. Recovering
arbitrary callee-saved registers would require CFA-relative memory loads, which
this step does not perform — the debugger supplies a memory provider for variable
reads (loc_read), but the unwinder leaves register slots as-is.
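
To make the "run the program up to pc" step concrete, here is a reduced CFI interpreter that tracks only the CFA rule. It is a sketch under assumptions: names are hypothetical, the CIE initial instructions and the FDE program are passed as one concatenated byte string, and only advance_loc, offset, restore, nop, def_cfa, def_cfa_register and def_cfa_offset are recognized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t cfa_reg; uint64_t cfa_ofs; } MiniCfaRule;

static bool cfi_uleb(const uint8_t *p, size_t len, size_t *i, uint64_t *out) {
    uint64_t v = 0;
    unsigned shift = 0;
    uint8_t b;
    do {
        if (*i >= len) return false;
        b = p[(*i)++];
        if (shift < 64) v |= (uint64_t)(b & 0x7f) << shift;
        shift += 7;
    } while (b & 0x80);
    *out = v;
    return true;
}

/* Apply instructions whose location is <= target_ofs (an offset from the
 * FDE's initial_location). Returns false on truncated or unsupported input. */
static bool mini_run_cfi(const uint8_t *ins, size_t len, uint64_t code_align,
                         uint64_t target_ofs, MiniCfaRule *rule) {
    size_t i = 0;
    uint64_t loc = 0, ignored;
    while (i < len) {
        uint8_t op = ins[i++];
        uint8_t hi = op & 0xc0, lo = op & 0x3f;
        if (hi == 0x40) {                 /* DW_CFA_advance_loc: delta in low bits */
            loc += (uint64_t)lo * code_align;
            if (loc > target_ofs) return true; /* later rows do not apply yet */
        } else if (hi == 0x80) {          /* DW_CFA_offset: register rule, not tracked */
            if (!cfi_uleb(ins, len, &i, &ignored)) return false;
        } else if (hi == 0xc0) {          /* DW_CFA_restore: no operand */
        } else {
            switch (lo) {
            case 0x00: break;             /* DW_CFA_nop */
            case 0x0c:                    /* DW_CFA_def_cfa reg, offset */
                if (!cfi_uleb(ins, len, &i, &rule->cfa_reg) ||
                    !cfi_uleb(ins, len, &i, &rule->cfa_ofs)) return false;
                break;
            case 0x0d:                    /* DW_CFA_def_cfa_register reg */
                if (!cfi_uleb(ins, len, &i, &rule->cfa_reg)) return false;
                break;
            case 0x0e:                    /* DW_CFA_def_cfa_offset offset */
                if (!cfi_uleb(ins, len, &i, &rule->cfa_ofs)) return false;
                break;
            default: return false;
            }
        }
    }
    return true;
}
```

Given the resulting rule, the caller's CFA is the value of register cfa_reg plus cfa_ofs; the return address then comes from whatever rule the program recorded for the return-address register, which this sketch does not track.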
4.7 Query and dump surfaces
dwarf_query.c is the PC/name-keyed public API: kit_dwarf_subprogram_at
/_named, the thin kit_dwarf_func_at, kit_dwarf_var_at (deepest-scope
first, then params, then globals), the vars_at / param_iter iterators, and
kit_dwarf_loc_read. loc_read is where the consumer reaches outside DWARF:
it takes a KitUnwindFrame (registers + CFA) and a caller-supplied
KitDwarfReadMemFn, so register locations resolve from the frame and
frame/global/expr locations resolve through the memory callback. The debugger
backs that callback with the JIT session's memory reader (driver/cmd/dbg.c).
dwarf_dump.c is the structural-enumeration API for dumpers (objdump --dwarf): CU, DIE (depth-first across all CUs), DIE-attribute, abbrev,
abbrev-attribute, line-row, and .debug_str iterators. These hand back raw
numeric DWARF codes and form classes; symbolic rendering is the dumper's job.
They are thin cursors over the same lazily-built state the query layer uses.
5. Clients
- kit addr2line (driver/cmd/addr2line.c) is the smallest consumer client: open the object, kit_dwarf_open, then per address call kit_dwarf_addr_to_line (and kit_dwarf_func_at for -f). It supports -e/-a/-f/-p/--basenames and reads addresses from argv or stdin.
- kit objdump --dwarf drives the dwarf_dump.c structural iterators.
- dbg (driver/cmd/dbg.c, src/dbg/step.c) uses kit_dwarf_subprogram_at for frame naming, kit_dwarf_unwind_step for backtraces, and kit_dwarf_var_at + kit_dwarf_loc_read for p name. See DBG.md.
- emu lifts guest code as a parser-shaped client of the same reader; see EMU.md.
Nothing past these entry points reaches into the reader internals — the
kit_dwarf_* API is the whole contract, which is what lets the producer and
consumer be tested against each other purely through emitted bytes.
Planned work: see doc/plan/DEBUG.md.