kit

git clone https://git.ryansepassi.com/git/kit.git

commit 4e486a6b390dd05525986f195c8d2640a5f9081e
parent ba3703764da428d800257c7ce5b760386a9ed5e7
Author: Ryan Sepassi <rsepassi@gmail.com>
Date:   Mon, 11 May 2026 13:07:45 -0700

dbg: multi-input cfree_jit_view + DWARF suffix-match / degradation

Concatenate per-input .debug_* sections in the view via section-symbol
relocations. The debug emitter now emits R_ABS32 against SK_SECTION
syms for every cross-section offset (CU header debug_abbrev_offset;
root-DIE stmt_list/ranges/str_offsets_base; .debug_str_offsets entries;
.debug_line line_strp slots). On-disk literal is preserved so bare-.o
readers stay byte-identical. link_jit walks every dbg input, snapshots
per-section view-prefix at each input boundary, and resolves SK_SECTION
targets against that snapshot — concatenated CUs land their references
in the right slot. `b file:line`, `list file:line`, `info functions`,
etc. now work across every input in the JIT image.
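
As a rough illustration of the prefix-snapshot idea described above (this
is not the code in link_jit.c): each input's section bytes are appended to
a merged view, the offset at which that input starts is remembered, and an
R_ABS32-style slot is rewritten to that offset plus the addend. Every name
below (`SketchReloc`, `sketch_append`, `sketch_apply_abs32`, `sketch_demo`)
is hypothetical, and the model assumes little-endian 4-byte slots.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified reloc record: one u32 slot per entry. */
typedef struct SketchReloc {
  uint32_t at;     /* offset of the slot within this input's own section */
  uint32_t addend; /* offset into the target section, as seen by this input */
} SketchReloc;

void sketch_put_u32le(uint8_t* p, uint32_t v) {
  p[0] = (uint8_t)(v & 0xffu);
  p[1] = (uint8_t)((v >> 8) & 0xffu);
  p[2] = (uint8_t)((v >> 16) & 0xffu);
  p[3] = (uint8_t)((v >> 24) & 0xffu);
}

uint32_t sketch_get_u32le(const uint8_t* p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Append one input's bytes to the merged view. The return value is the
 * snapshot: where this input's bytes start inside the view. */
uint32_t sketch_append(uint8_t* view, uint32_t* view_len, uint32_t cap,
                       const uint8_t* src, uint32_t n) {
  uint32_t prefix = *view_len;
  assert(prefix <= cap && n <= cap - prefix);
  memcpy(view + prefix, src, n);
  *view_len = prefix + n;
  return prefix;
}

/* R_ABS32-style application: slot = S + addend, where S is the target
 * section's prefix for the same input that owns the slot. */
void sketch_apply_abs32(uint8_t* view, uint32_t src_prefix,
                        uint32_t target_prefix, const SketchReloc* r) {
  sketch_put_u32le(view + src_prefix + r->at, target_prefix + r->addend);
}

/* Two inputs: A has 10 bytes of .debug_str, B has 6. B's single
 * .debug_str_offsets entry refers to offset 3 of B's own strings. */
uint32_t sketch_demo(void) {
  uint8_t str_view[32], off_view[32];
  uint32_t str_len = 0, off_len = 0;
  const uint8_t str_a[10] = {0}, str_b[6] = {0};
  const uint8_t off_a[4] = {0, 0, 0, 0};
  const uint8_t off_b[4] = {3, 0, 0, 0}; /* on-disk literal is kept */
  SketchReloc rb = {0u, 3u};
  uint32_t str_prefix_b, off_prefix_b;

  (void)sketch_append(str_view, &str_len, (uint32_t)sizeof str_view, str_a, 10u);
  (void)sketch_append(off_view, &off_len, (uint32_t)sizeof off_view, off_a, 4u);
  str_prefix_b = sketch_append(str_view, &str_len, (uint32_t)sizeof str_view, str_b, 6u);
  off_prefix_b = sketch_append(off_view, &off_len, (uint32_t)sizeof off_view, off_b, 4u);

  sketch_apply_abs32(off_view, off_prefix_b, str_prefix_b, &rb);
  return sketch_get_u32le(off_view + off_prefix_b);
}
```

In this toy setup the second input's slot ends up pointing 3 bytes past
the start of its own strings within the merged view. With a prefix of
zero the written value equals the addend, which is the property the
message relies on for bare-.o readers.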

cfree_dwarf_line_to_addr matches by exact path or `/`-suffix and
returns 0/1/2/3 (ok / file not covered / no row at line / ambiguous).
cfree_dwarf_line_to_addr_all enumerates distinct file_norm candidates
so the REPL can prompt for disambiguation. cfree_dwarf_addr_to_line
and cfree_dwarf_var_at return distinct code 2 when a PC sits outside
every CU's coverage; bt / p NAME format that as "no debug info for
this frame".
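
The matching rule and the 0/1/2/3 codes can be sketched over a flat table
of rows. This is a simplified model under stated assumptions, not the
library's implementation: `SketchRow`, `sketch_rows`,
`sketch_file_matches`, `sketch_line_to_addr` and `sketch_query` are
hypothetical names, and real line programs are per-CU rather than one
array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct SketchRow {
  const char* path; /* normalized file path from the line table */
  uint32_t line;
  uint64_t pc;
} SketchRow;

/* Exact match, or `path` ends with "/" followed by `user`. */
int sketch_file_matches(const char* path, const char* user) {
  size_t plen = strlen(path), ulen = strlen(user);
  if (strcmp(path, user) == 0) return 1;
  if (plen <= ulen) return 0;
  if (path[plen - ulen - 1] != '/') return 0;
  return memcmp(path + plen - ulen, user, ulen) == 0;
}

/* 0 = ok, 1 = file not covered, 2 = no row at line, 3 = ambiguous.
 * Ambiguity means two different paths matched, not two PCs on one line. */
int sketch_line_to_addr(const SketchRow* rows, size_t n, const char* file,
                        uint32_t line, uint64_t* pc_out) {
  const char* first = NULL;
  int file_seen = 0, ambiguous = 0;
  size_t i;
  assert(rows && file && pc_out);
  *pc_out = 0;
  if (file[0] == '\0') return 1;
  for (i = 0; i < n; ++i) {
    if (!sketch_file_matches(rows[i].path, file)) continue;
    file_seen = 1;
    if (rows[i].line != line) continue;
    if (!first) {
      first = rows[i].path;
      *pc_out = rows[i].pc; /* first match stays usable even when ambiguous */
    } else if (strcmp(first, rows[i].path) != 0) {
      ambiguous = 1;
    }
  }
  if (ambiguous) return 3;
  if (first) return 0;
  return file_seen ? 2 : 1;
}

/* Made-up rows for illustration only. */
const SketchRow sketch_rows[] = {
    {"/proj/util.c", 42u, 0x1000u},
    {"/proj/util.c", 42u, 0x1004u}, /* same file and line: not ambiguous */
    {"/vendor/lib/util.c", 42u, 0x3000u},
};

int sketch_query(const char* file, uint32_t line) {
  uint64_t pc;
  return sketch_line_to_addr(sketch_rows, sizeof sketch_rows / sizeof sketch_rows[0],
                             file, line, &pc);
}
```

A caller that gets the ambiguous code would then list the candidate paths
and ask for a longer suffix; with the rows above, `proj/util.c` is already
long enough to pick one file.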

New `list FILE:LINE` REPL command (alias `l`) reads the file via
env.file_io.read_all and prints the target line with up to 5 lines of
context on each side (DBG_LIST_CTX). It falls back to "[source not
available; pc=0xADDR]" when the file isn't on disk (e.g. DWARF from a
.o/.a whose source isn't local).
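
The window arithmetic behind such a listing can be sketched as follows.
This is a minimal stand-in, not the driver's code: `sketch_list_window`,
`sketch_demo` and `SKETCH_CTX` are hypothetical names, the buffer is
assumed to be already in memory, and passing a NULL stream (an assumption
of this sketch) only counts lines instead of printing them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_CTX 5u /* lines shown before and after the target */

/* Print lines [target - SKETCH_CTX, target + SKETCH_CTX] of `src`,
 * clamped to the start of the file, marking the target line with '>'.
 * Returns how many lines fell inside the window. */
uint32_t sketch_list_window(const char* src, size_t size, uint32_t target,
                            FILE* out) {
  uint32_t lo, hi, cur = 1u, printed = 0u;
  size_t start = 0u, i;
  assert(src && target >= 1u);
  lo = target > SKETCH_CTX ? target - SKETCH_CTX : 1u;
  hi = target > UINT32_MAX - SKETCH_CTX ? UINT32_MAX : target + SKETCH_CTX;
  for (i = 0u; i <= size; ++i) {
    int at_end = (i == size);
    if (!at_end && src[i] != '\n') continue;
    /* A trailing '\n' does not start another (empty) line. */
    if (at_end && start == size) break;
    if (cur >= lo && cur <= hi) {
      if (out)
        fprintf(out, "%6u%s %.*s\n", (unsigned)cur,
                cur == target ? " >" : "  ", (int)(i - start), src + start);
      ++printed;
    }
    ++cur;
    start = i + 1u;
  }
  return printed;
}

/* Count window lines for a made-up 20-line buffer of "x\n" lines. */
uint32_t sketch_demo(uint32_t target) {
  char buf[40];
  size_t k;
  for (k = 0u; k < 20u; ++k) {
    buf[2u * k] = 'x';
    buf[2u * k + 1u] = '\n';
  }
  return sketch_list_window(buf, sizeof buf, target, NULL);
}
```

Near the top or bottom of a file the window simply shrinks, and a target
past the last line yields no lines at all, which is where a caller would
report that the file is shorter than requested.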

Diffstat:
M doc/DBG.md | 34 +++++++++++++++-------------------
M doc/JIT.md | 14 ++++++++++----
M driver/dbg.c | 252 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------
M include/cfree.h | 43 +++++++++++++++++++++++++++++++++++++------
M src/debug/debug_emit.c | 160 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
M src/dwarf/dwarf_line.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
M src/dwarf/dwarf_query.c | 10 +++++++++-
M src/link/link_jit.c | 215 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
8 files changed, 713 insertions(+), 141 deletions(-)

diff --git a/doc/DBG.md b/doc/DBG.md
@@ -409,16 +409,12 @@ the box.
 `LinkImage->syms`, surface FUNC / OBJ / COMMON / TLS / IFUNC / ABS; names go through `obj_format_demangle_c` so Mach-O's leading `_` is stripped
-- [ ] `cfree_jit_view(jit)` — **blocks all DWARF-dependent REPL
- features** (`bt`, typed `p NAME`, `b file:line`, `info
- locals/args`, source-level resume modes). Either retain input
- `CfreeObjBuilder`(s) on `CfreeJit` with post-link debug-section
- relocations applied, or emit a fresh `CfreeBytesInput` after
- linking and re-open. PC translation from image-relative to
- runtime addresses needed on the DWARF side too. Must span
- every input in the link — pipeline-compiled sources *and*
- prebuilt `.o` / `.a` debug sections — not just the pipeline's
- outputs (see §10).
+- [x] `cfree_jit_view(jit)` — multi-input concatenation. Debug emitter
+ writes R_ABS32 relocs against SK_SECTION symbols for every
+ cross-section offset; view-builder walks every dbg input,
+ snapshots per-section prefix sizes, and resolves SK_SECTION
+ relocs against that snapshot. Externally produced `.o` /`.a`
+ debug input rides the same reloc-application path.
 ### Public arch-register API
@@ -454,15 +450,15 @@ consumer still assumes one debug-info-bearing compile unit per lookup.
 The three items below land after `cfree_jit_view` so they can be tested against real debug info; see §10 for the failure modes each addresses.
-- [ ] `cfree_dwarf_line_to_addr`: define and implement a basename
- disambiguation rule when two inputs share a filename. Either
- require a unique path suffix or surface the candidate list as an
- error; current behavior is "first match wins" by walking
- compile-unit order.
-- [ ] DWARF lookups return "no data" (not failure) when a PC lands in
- an input compiled without `-g`. The REPL formats `bt` /
- `info locals` / `p NAME` accordingly; `b file:line` errors with a
- file-not-covered message instead of a generic miss.
+- [x] `cfree_dwarf_line_to_addr`: exact-or-path-suffix match. Returns
+ 0/1/2/3 (ok / file not covered / no row at line / ambiguous).
+ `cfree_dwarf_line_to_addr_all` enumerates candidates so the REPL
+ can prompt the user with a longer suffix on collision.
+- [x] DWARF lookups return distinct "no data" codes (2) when a PC sits
+ outside every CU's coverage / a file is uncovered. REPL formats
+ `bt` as `[no debug info for this frame]`, `p NAME` as `"no debug
+ info for this frame; '%s' not in global symbols"`, and
+ `b file:line` as `"file not covered by debug info: %s"`.
 - [ ] `driver/dbg.c` source listing (`list file:line`) reads from disk via `env.file_io`; when the input is from a `.o` / `.a` debug section whose source file isn't accessible, show the DWARF line
diff --git a/doc/JIT.md b/doc/JIT.md
@@ -89,10 +89,16 @@ inputs.
 wired up. Remaining items are listed in `doc/DBG.md` §12; the JIT-facing ones to keep an eye on:
-- [ ] `cfree_jit_view` — multi-input handling. v1 returns NULL when more
- than one `CfreeObjBuilder` was linked (the cross-CU offset
- adjustment for concatenated debug sections is not done).
- `src/link/link_jit.c:490`.
+- [x] `cfree_jit_view` — multi-input handling. The debug emitter now
+ emits R_ABS32 relocs against SK_SECTION symbols for every
+ cross-section offset (CU header `debug_abbrev_offset`, root-DIE
+ `stmt_list` / `ranges` / `str_offsets_base`, `.debug_str_offsets`
+ entries, `.debug_line` `line_strp` slots). The view-builder in
+ `link_jit.c` walks every dbg input, snapshots per-section prefix
+ sizes, and resolves SK_SECTION relocs against the snapshot —
+ concatenated CUs land their cross-section offsets in the right
+ slot. Externally produced `.o` debug info routes through the
+ same path.
 - [ ] Windows host adapter for the JIT debugger (vectored exception handlers + `SetThreadContext` instead of POSIX signals). `doc/DBG.md` §host-adapter.
diff --git a/driver/dbg.c b/driver/dbg.c @@ -105,6 +105,7 @@ void driver_help_dbg(void) { " p NAME print variable / global\n" " set NAME VALUE write VALUE into NAME\n" " x ADDR [count] examine memory (default 16 bytes)\n" + " list FILE:LINE | l FILE:LINE source listing around FILE:LINE\n" " info b list breakpoints\n" " info reg, info registers dump registers\n" " info locals | info args list locals / args at current PC\n" @@ -528,10 +529,44 @@ static int dbg_resolve_loc(DbgState* s, const char* spec, BpKind* kind_out, driver_free(s->env, file, file_size); return 1; } - if (cfree_dwarf_line_to_addr(s->dwarf, file, (uint32_t)line64, &pc) != 0) { - driver_errf(DBG_TOOL, "no line entry for %s", spec); - driver_free(s->env, file, file_size); - return 1; + { + int rc = cfree_dwarf_line_to_addr(s->dwarf, file, (uint32_t)line64, &pc); + if (rc == 1) { + driver_errf(DBG_TOOL, "file not covered by debug info: %s", file); + driver_free(s->env, file, file_size); + return 1; + } + if (rc == 2) { + driver_errf(DBG_TOOL, "no line %u in %s", (uint32_t)line64, file); + driver_free(s->env, file, file_size); + return 1; + } + if (rc == 3) { + /* Ambiguous: enumerate candidates and ask the user to retype + * with a longer suffix. Cap at 8 — more than that, just say so. */ + CfreeDwarfLineMatch cands[8]; + uint32_t n = 0; + uint32_t k; + cfree_dwarf_line_to_addr_all(s->dwarf, file, (uint32_t)line64, cands, + 8u, &n); + driver_errf(DBG_TOOL, "ambiguous: %s:%u matches %u files", + file, (uint32_t)line64, (unsigned)n); + for (k = 0; k < n && k < 8u; ++k) { + driver_errf(DBG_TOOL, " %s (0x%llx)", cands[k].file, + (unsigned long long)cands[k].pc); + } + if (n > 8u) driver_errf(DBG_TOOL, " ... and %u more", n - 8u); + driver_errf(DBG_TOOL, + "use a longer path suffix (e.g. 
b dir/%s:%u)", file, + (uint32_t)line64); + driver_free(s->env, file, file_size); + return 1; + } + if (rc != 0) { + driver_errf(DBG_TOOL, "no line entry for %s", spec); + driver_free(s->env, file, file_size); + return 1; + } } driver_free(s->env, file, file_size); *kind_out = BP_LINE; @@ -829,11 +864,13 @@ static void dbg_cmd_bt(DbgState* s) { const char* file = NULL; uint32_t line = 0; uint32_t col = 0; - if (cfree_dwarf_addr_to_line(s->dwarf, img_frame.pc, &file, &line, - &col) == 0 && - file) { + int rc = cfree_dwarf_addr_to_line(s->dwarf, img_frame.pc, &file, &line, + &col); + if (rc == 0 && file) { driver_printf(" at %s:%u", file, line); if (col) driver_printf(":%u", col); + } else if (rc == 2) { + driver_printf(" [no debug info for this frame]"); } } driver_printf("\n"); @@ -1099,37 +1136,47 @@ static void dbg_cmd_print(DbgState* s, const char* name) { return; } - if (s->dwarf && - cfree_dwarf_var_at(s->dwarf, dbg_pc_rt_to_img(s, s->last_stop.regs.pc), - name, &loc) == 0) { - dbg_translate_loc(s, &loc); - if (dbg_read_value(s, &loc, &s->last_stop.regs, stack_buf, - sizeof(stack_buf), &buf, &alloc, &got) != 0) { - driver_errf(DBG_TOOL, "could not read %s", name); + { + int rc = s->dwarf + ? cfree_dwarf_var_at( + s->dwarf, dbg_pc_rt_to_img(s, s->last_stop.regs.pc), + name, &loc) + : 1; + if (rc == 0) { + dbg_translate_loc(s, &loc); + if (dbg_read_value(s, &loc, &s->last_stop.regs, stack_buf, + sizeof(stack_buf), &buf, &alloc, &got) != 0) { + driver_errf(DBG_TOOL, "could not read %s", name); + return; + } + driver_printf("%s = ", name); + dbg_print_value(s, loc.type, buf, got, 0); + driver_printf("\n"); + dbg_release_value_buf(s, buf, alloc); return; } - driver_printf("%s = ", name); - dbg_print_value(s, loc.type, buf, got, 0); - driver_printf("\n"); - dbg_release_value_buf(s, buf, alloc); - return; - } - /* DWARF didn't know about it — try a global symbol. 
*/ - { - void* p = cfree_jit_lookup(s->jit, name); - if (p) { - union { - void* p; - uint64_t u; - } cv; - cv.p = p; - driver_printf("%s = 0x%llx (no DWARF type info)\n", name, - (unsigned long long)cv.u); - return; + /* DWARF didn't know about it — try a global symbol. */ + { + void* p = cfree_jit_lookup(s->jit, name); + if (p) { + union { + void* p; + uint64_t u; + } cv; + cv.p = p; + driver_printf("%s = 0x%llx (no DWARF type info)\n", name, + (unsigned long long)cv.u); + return; + } } + if (rc == 2) + driver_errf(DBG_TOOL, + "no debug info for this frame; '%s' not in global symbols", + name); + else + driver_errf(DBG_TOOL, "no variable or symbol named '%s'", name); } - driver_errf(DBG_TOOL, "no variable or symbol named '%s'", name); } /* ============================================================ @@ -1150,11 +1197,20 @@ static void dbg_cmd_set(DbgState* s, const char* name, uint64_t value) { driver_errf(DBG_TOOL, "no program is stopped"); return; } - if (!s->dwarf || - cfree_dwarf_var_at(s->dwarf, dbg_pc_rt_to_img(s, s->last_stop.regs.pc), - name, &loc) != 0) { - driver_errf(DBG_TOOL, "no variable named '%s'", name); - return; + { + int rc = s->dwarf + ? cfree_dwarf_var_at( + s->dwarf, dbg_pc_rt_to_img(s, s->last_stop.regs.pc), + name, &loc) + : 1; + if (rc != 0) { + if (rc == 2) + driver_errf(DBG_TOOL, "no debug info for this frame; cannot set '%s'", + name); + else + driver_errf(DBG_TOOL, "no variable named '%s'", name); + return; + } } dbg_translate_loc(s, &loc); @@ -1370,6 +1426,115 @@ static void dbg_cmd_examine(DbgState* s, uint64_t addr, size_t count) { } /* ============================================================ + * `list file:line` + * ============================================================ + * Print a context window of source lines centered on `file:line`. + * + * Reads `file` from disk via env.file_io. When the file isn't on disk + * (e.g. 
the DWARF came from a `.o` / `.a` whose source isn't available + * here), report the DWARF line number alone and omit the snippet. */ + +#define DBG_LIST_CTX 5 /* lines printed before/after the target */ + +static void dbg_cmd_list(DbgState* s, const char* spec) { + const char* colon; + size_t flen; + char path[1024]; + uint64_t line_u; + size_t used; + uint64_t pc; + int rc; + CfreeFileData fd; + const CfreeFileIO* io; + + if (!s->dwarf) { + driver_errf(DBG_TOOL, "no DWARF: cannot resolve %s", spec); + return; + } + + colon = driver_strchr(spec, ':'); + if (!colon || !dbg_isdigit((unsigned char)colon[1])) { + driver_errf(DBG_TOOL, "usage: list file:line"); + return; + } + flen = (size_t)(colon - spec); + if (flen == 0 || flen >= sizeof(path)) { + driver_errf(DBG_TOOL, "bad file in '%s'", spec); + return; + } + driver_memcpy(path, spec, flen); + path[flen] = '\0'; + used = dbg_parse_uint(colon + 1, &line_u); + if (!used || colon[1 + used] != '\0') { + driver_errf(DBG_TOOL, "usage: list file:line"); + return; + } + + /* Validate via DWARF first. Maps the new return codes to focused + * REPL messages identical to `b file:line`'s. */ + rc = cfree_dwarf_line_to_addr(s->dwarf, path, (uint32_t)line_u, &pc); + if (rc == 1) { + driver_errf(DBG_TOOL, "file not covered by debug info: %s", path); + return; + } + if (rc == 2) { + driver_errf(DBG_TOOL, "no line %u in %s", (uint32_t)line_u, path); + return; + } + if (rc == 3) { + driver_errf(DBG_TOOL, "ambiguous: %s:%u matches multiple files; " + "use a longer path suffix", + path, (uint32_t)line_u); + return; + } + if (rc != 0) { + driver_errf(DBG_TOOL, "no line entry for %s", spec); + return; + } + + /* Try to read the file via file_io. On miss, fall back to the + * DWARF-only summary line per doc/DBG.md §10. */ + io = s->env && s->env->file_io.read_all ? 
&s->env->file_io : NULL; + if (!io || !io->read_all(io->user, path, &fd)) { + driver_printf("%s:%u [source not available; pc=0x%llx]\n", path, + (uint32_t)line_u, (unsigned long long)pc); + return; + } + + /* Walk the buffer, counting newlines. Print lines in + * [target-DBG_LIST_CTX, target+DBG_LIST_CTX]. */ + { + uint32_t target = (uint32_t)line_u; + uint32_t lo = target > DBG_LIST_CTX ? target - DBG_LIST_CTX : 1; + uint32_t hi = target + DBG_LIST_CTX; + uint32_t cur = 1; + const uint8_t* p = fd.data; + const uint8_t* end = fd.data + fd.size; + const uint8_t* line_start = p; + while (p <= end) { + int eol = (p == end) || (*p == '\n'); + if (eol) { + if (cur >= lo && cur <= hi) { + size_t len = (size_t)(p - line_start); + driver_printf("%6u%s %.*s\n", cur, cur == target ? " >" : " ", + (int)len, (const char*)line_start); + } + ++cur; + if (p == end) break; + line_start = p + 1; + } + ++p; + } + if (cur <= target) { + driver_errf(DBG_TOOL, "file has only %u lines; %u requested", + cur - 1u, target); + } + } + + if (io->release) io->release(io->user, &fd); +} + +/* ============================================================ * `info b`, `b LOC`, `d N`, `enable N` / `disable N`, `ignore N COUNT` * ============================================================ */ @@ -1526,6 +1691,7 @@ static void dbg_cmd_help(void) { " p NAME print variable / global\n" " set NAME VALUE write VALUE into NAME\n" " x ADDR [count] examine memory (count bytes, default 16)\n" + " list FILE:LINE, l FILE:LINE source listing around FILE:LINE\n" " info b list breakpoints\n" " info reg, info registers dump registers\n" " info locals list locals at current PC\n" @@ -1752,6 +1918,16 @@ static int dbg_dispatch(DbgState* s, char* line) { dbg_cmd_print(s, name); return 0; } + if (driver_streq(cmd, "list") || driver_streq(cmd, "l")) { + char* loc; + dbg_take_word(rest, &loc); + if (!*loc) { + driver_errf(DBG_TOOL, "usage: list file:line"); + return 0; + } + dbg_cmd_list(s, loc); + return 0; + } if 
(driver_streq(cmd, "x") || driver_streq(cmd, "examine")) { char* addr_s; char* count_s; diff --git a/include/cfree.h b/include/cfree.h @@ -1345,12 +1345,23 @@ void cfree_obj_reliter_free(CfreeObjRelocIter*); * the standard sections wherever they live), or on internal failure. * * cfree_dwarf_addr_to_line maps a runtime / image PC to the source file, - * line, and column that produced it. Returns 0 on success and 1 when the - * PC has no matching .debug_line entry (e.g. compiler scaffolding). - * - * cfree_dwarf_line_to_addr is the inverse: returns 0 on success, 1 when no - * statement-flagged row matches the (file, line) pair. The first matching - * row wins. + * line, and column that produced it. Return codes: + * 0 — PC matched a line entry; outputs filled. + * 1 — PC is inside a CU's address range but no row matched (e.g. + * compiler scaffolding). + * 2 — PC is outside every CU's coverage; the caller is in a frame + * that was compiled without `-g` (REPL renders as "no debug info + * for this frame"). + * + * cfree_dwarf_line_to_addr is the inverse. `file` matches a CU's + * line-table filename exactly, or as a path suffix (`util.c` matches + * `/proj/util.c` but not `/proj/run/futile.c`). Return codes: + * 0 — unique match, pc_out filled. + * 1 — file not present in any CU (REPL: "file not covered"). + * 2 — file present, but no row at `line` (REPL: "no line N in file"). + * 3 — ambiguous: more than one distinct PC matches via suffix; + * pc_out is the first match. Use cfree_dwarf_line_to_addr_all to + * enumerate candidates and prompt for disambiguation. * * cfree_dwarf_func_at returns the enclosing subprogram's name and * inclusive PC bounds. 
Returns 0 on success, 1 if no subprogram contains @@ -1365,6 +1376,20 @@ int cfree_dwarf_addr_to_line(CfreeDebugInfo*, uint64_t pc, uint32_t* col_out); int cfree_dwarf_line_to_addr(CfreeDebugInfo*, const char* file, uint32_t line, uint64_t* pc_out); + +/* Disambiguation enumerator paired with cfree_dwarf_line_to_addr's + * ambiguous return. `out[k]` is filled for the first `cap` distinct + * candidate PCs; `*n_out` is the total candidate count, which may + * exceed `cap`. `file` strings are interned in the CfreeDebugInfo and + * live until cfree_dwarf_close. Returns 0 on success, 1 on invalid args. */ +typedef struct CfreeDwarfLineMatch { + uint64_t pc; + const char* file; +} CfreeDwarfLineMatch; + +int cfree_dwarf_line_to_addr_all(CfreeDebugInfo*, const char* file, + uint32_t line, CfreeDwarfLineMatch* out, + uint32_t cap, uint32_t* n_out); int cfree_dwarf_func_at(CfreeDebugInfo*, uint64_t pc, const char** name_out, uint64_t* low_pc_out, uint64_t* high_pc_out); @@ -1506,6 +1531,12 @@ typedef struct CfreeDwarfVarLoc { } v; } CfreeDwarfVarLoc; +/* Look up a variable visible at `pc` by name. Return codes: + * 0 — found; *out filled. + * 1 — `pc` is inside a known subprogram but no variable named `name` + * resolves there (typo / out-of-scope). + * 2 — `pc` is not covered by any subprogram (no debug info for this + * frame); globals were still consulted before returning. */ int cfree_dwarf_var_at(CfreeDebugInfo*, uint64_t pc, const char* name, CfreeDwarfVarLoc* out); int cfree_dwarf_loc_read(CfreeDebugInfo*, const CfreeDwarfVarLoc*, diff --git a/src/debug/debug_emit.c b/src/debug/debug_emit.c @@ -159,7 +159,8 @@ typedef struct EmitCtx { u32 nrng_relocs; u32 nrng_relocs_cap; - /* Section ids (filled lazily). */ + /* Section ids (pre-created up front so cross-section relocs can name + * their target before its bytes are written). 
*/ ObjSecId sec_str; ObjSecId sec_line_str; ObjSecId sec_str_off; @@ -168,6 +169,32 @@ typedef struct EmitCtx { ObjSecId sec_line; ObjSecId sec_aranges; ObjSecId sec_rnglists; + + /* SK_SECTION ObjSyms over the same sections. They exist so the CU + * header + root DIE can encode cross-section offsets (debug_abbrev_offset, + * stmt_list, str_offsets_base, ranges) and the line / str-offsets + * payloads can encode their .debug_line_str / .debug_str references + * as relocations. The on-disk u32 stays zero; the relocation's addend + * carries the in-section offset. In a normal `.o` emit the linker + * applies R_ABS32 with S=section_vaddr=0 (debug sections are not laid + * out), so the written value equals the addend — byte-identical to the + * pre-reloc behaviour. In the JIT view, link_jit applies the same + * reloc against the section's accumulated prefix in the merged view, + * so concatenated multi-input debug bytes resolve to the right slot. */ + ObjSymId ssym_str; + ObjSymId ssym_line_str; + ObjSymId ssym_str_off; + ObjSymId ssym_abbrev; + ObjSymId ssym_line; + ObjSymId ssym_rnglists; + + /* Body-relative offsets of the three CU-root-DIE attributes whose + * payloads are cross-section offsets. Captured at the call sites in + * debug_emit() and consumed by emit_section_info() to emit R_ABS32 + * relocs at cu_header_size + <at>. */ + u32 root_stmt_list_at; + u32 root_ranges_at; + u32 root_str_off_base_at; } EmitCtx; /* ---------------------------------------------------------------- */ @@ -661,6 +688,13 @@ static ObjSecId mk_section(EmitCtx* e, const char* name) { return obj_section(e->ob, n, SEC_DEBUG, 0, 1); } +/* Pre-create one SK_SECTION ObjSym pointing at `sec`. Section symbols are + * nameless (Sym 0); identity is the section_id they reference. SB_LOCAL + * because section symbols are always local in ELF/Mach-O. 
*/ +static ObjSymId mk_section_sym(EmitCtx* e, ObjSecId sec) { + return obj_symbol(e->ob, 0, SB_LOCAL, SK_SECTION, sec, 0, 0); +} + static void flatten_to_section(EmitCtx* e, ObjSecId sec, const Buf* src) { u32 total = buf_pos(src); if (total == 0) return; @@ -672,12 +706,10 @@ static void flatten_to_section(EmitCtx* e, ObjSecId sec, const Buf* src) { } static void emit_section_str(EmitCtx* e) { - e->sec_str = mk_section(e, ".debug_str"); flatten_to_section(e, e->sec_str, &e->str.buf); } static void emit_section_line_str(EmitCtx* e) { - e->sec_line_str = mk_section(e, ".debug_line_str"); flatten_to_section(e, e->sec_line_str, &e->line_str.buf); } @@ -685,17 +717,31 @@ static void emit_section_str_offsets(EmitCtx* e) { Buf b; u32 i; u32 unit_length; + u32 entries_off; /* byte offset of first entry within the section */ buf_init(&b, e->heap); unit_length = 4 + e->str.nsyms * 4; /* version+pad + N*4 */ form_u32(&b, unit_length); form_u16(&b, 5); form_u16(&b, 0); + entries_off = buf_pos(&b); for (i = 0; i < e->str.nsyms; ++i) { + /* Write the literal offset (so a bare-`.o` reader that doesn't apply + * relocs still sees the correct value), and *also* emit an R_ABS32 + * reloc against .debug_str with addend = same literal. When the + * linker or JIT view-builder applies the reloc it overwrites the + * slot with S + addend; for a normal `.o` (debug sections not laid + * out) S=0 so the value is unchanged, and for the concatenated JIT + * view S = view-prefix into .debug_str so the slot picks up the + * right per-input offset. */ u32* ofs = SymToU32_get(&e->str.by_sym, e->str.syms[i]); form_u32(&b, ofs ? *ofs : 0); } - e->sec_str_off = mk_section(e, ".debug_str_offsets"); flatten_to_section(e, e->sec_str_off, &b); + for (i = 0; i < e->str.nsyms; ++i) { + u32* ofs = SymToU32_get(&e->str.by_sym, e->str.syms[i]); + obj_reloc(e->ob, e->sec_str_off, entries_off + i * 4u, R_ABS32, + e->ssym_str, (i64)(ofs ? 
*ofs : 0)); + } buf_fini(&b); } @@ -703,7 +749,6 @@ static void emit_section_abbrev(EmitCtx* e) { Buf b; buf_init(&b, e->heap); abbrev_encode(&e->abbr, &b); - e->sec_abbrev = mk_section(e, ".debug_abbrev"); flatten_to_section(e, e->sec_abbrev, &b); buf_fini(&b); } @@ -729,6 +774,16 @@ static void emit_section_line(EmitCtx* e) { u32 dir_count; Sym* dirs = NULL; u32 ndirs = 0, dirs_cap = 0; + /* Pending line_strp relocs. Each slot is a u32 in hdr_body at + * `slot[k].at` with addend `slot[k].ofs` (the resolved .debug_line_str + * offset). Translated to section offsets and turned into R_ABS32 + * relocs against e->ssym_line_str after we know hdr_body's location + * within the section. */ + struct LineStrpSlot { + u32 at; + u32 ofs; + }* lsp_slots = NULL; + u32 nlsp = 0, lsp_cap = 0; /* aarch64: instructions are 4-byte aligned. DW_LNS_advance_pc takes the * advance in *operations*, which the consumer multiplies by min_inst_length * (DWARF5 §6.2.5.2). Keep this in sync with the value emitted into the @@ -850,7 +905,14 @@ static void emit_section_line(EmitCtx* e) { dir_count = ndirs; form_uleb(&hdr_body, dir_count); for (i = 0; i < dir_count; ++i) { - form_u32(&hdr_body, line_str_offset(e, dirs[i])); + u32 at = buf_pos(&hdr_body); + u32 ofs = line_str_offset(e, dirs[i]); + form_u32(&hdr_body, ofs); /* literal; also bound to a reloc below */ + if (!VEC_GROW(e->heap, lsp_slots, lsp_cap, nlsp + 1)) { + lsp_slots[nlsp].at = at; + lsp_slots[nlsp].ofs = ofs; + nlsp++; + } } /* file_name_entry_format: 2 entries */ @@ -861,15 +923,31 @@ static void emit_section_line(EmitCtx* e) { form_uleb(&hdr_body, DW_FORM_udata); if (e->d->nfiles == 0) { + u32 at; + u32 ofs; form_uleb(&hdr_body, 1); - form_u32(&hdr_body, line_str_offset(e, pool_intern_cstr(pool, ""))); + at = buf_pos(&hdr_body); + ofs = line_str_offset(e, pool_intern_cstr(pool, "")); + form_u32(&hdr_body, ofs); + if (!VEC_GROW(e->heap, lsp_slots, lsp_cap, nlsp + 1)) { + lsp_slots[nlsp].at = at; + lsp_slots[nlsp].ofs = ofs; + 
nlsp++; + } form_uleb(&hdr_body, 0); } else { form_uleb(&hdr_body, e->d->nfiles); for (i = 0; i < e->d->nfiles; ++i) { DebugFile* df = &e->d->files[i]; u32 di; - form_u32(&hdr_body, line_str_offset(e, df->base)); + u32 at = buf_pos(&hdr_body); + u32 ofs = line_str_offset(e, df->base); + form_u32(&hdr_body, ofs); + if (!VEC_GROW(e->heap, lsp_slots, lsp_cap, nlsp + 1)) { + lsp_slots[nlsp].at = at; + lsp_slots[nlsp].ofs = ofs; + nlsp++; + } for (di = 0; di < ndirs; ++di) { if (dirs[di] == df->dir) break; } @@ -909,18 +987,26 @@ static void emit_section_line(EmitCtx* e) { } if (tmp) e->heap->free(e->heap, tmp, plen ? plen : 1); } - e->sec_line = mk_section(e, ".debug_line"); flatten_to_section(e, e->sec_line, &out); - /* program-start in section bytes = 12 (unit_length+ver+addr+seg+hl) + hl */ + /* program-start in section bytes = 12 (unit_length+ver+addr+seg+hl) + hl. + * hdr_body sits at section offset 12 (right after the unit header), + * so a line_strp slot at hdr_body offset `at` is at section offset + * `12 + at`. 
*/ { u32 prog_start = 12 + hl; + u32 hdr_start = 12; u32 k; for (k = 0; k < e->nline_relocs; ++k) { obj_reloc(e->ob, e->sec_line, prog_start + e->line_relocs[k].buf_offset, R_ABS64, e->line_relocs[k].sym, 0); } + for (k = 0; k < nlsp; ++k) { + obj_reloc(e->ob, e->sec_line, hdr_start + lsp_slots[k].at, R_ABS32, + e->ssym_line_str, (i64)lsp_slots[k].ofs); + } } } + if (lsp_slots) e->heap->free(e->heap, lsp_slots, sizeof(*lsp_slots) * lsp_cap); buf_fini(&prog); buf_fini(&hdr_body); buf_fini(&out); @@ -982,7 +1068,6 @@ static void emit_section_aranges(EmitCtx* e) { le[3] = (u8)((unit_length >> 24) & 0xff); buf_patch(&b, 0, le, 4); } - e->sec_aranges = mk_section(e, ".debug_aranges"); flatten_to_section(e, e->sec_aranges, &b); for (i = 0; i < e->naranges_relocs; ++i) { obj_reloc(e->ob, e->sec_aranges, e->aranges_relocs[i].buf_offset, R_ABS64, @@ -1025,7 +1110,6 @@ static void emit_section_rnglists(EmitCtx* e) { le[3] = (u8)((unit_length >> 24) & 0xff); buf_patch(&b, 0, le, 4); } - e->sec_rnglists = mk_section(e, ".debug_rnglists"); flatten_to_section(e, e->sec_rnglists, &b); for (i = 0; i < e->nrng_relocs; ++i) { obj_reloc(e->ob, e->sec_rnglists, e->rng_relocs[i].buf_offset, R_ABS64, @@ -1045,7 +1129,7 @@ static void emit_section_info(EmitCtx* e) { form_u16(&out, 5); form_u8(&out, DW_UT_compile); form_u8(&out, e->d->c->target.ptr_size); - form_u32(&out, 0); /* debug_abbrev_offset */ + form_u32(&out, 0); /* debug_abbrev_offset — filled by R_ABS32 reloc below */ /* Append body */ { u32 plen = body_size; @@ -1056,8 +1140,20 @@ static void emit_section_info(EmitCtx* e) { } if (tmp) e->heap->free(e->heap, tmp, plen ? plen : 1); } - e->sec_info = mk_section(e, ".debug_info"); flatten_to_section(e, e->sec_info, &out); + /* CU header cross-section refs: debug_abbrev_offset at byte 8. Root + * DIE cross-section refs (stmt_list / ranges / str_offsets_base) live + * at body offsets captured during CU-body construction; section offset + * is cu_header_size + body offset. 
Addend carries the in-target + * offset (0 for abbrev/stmt_list, 12 for rnglists past its header, 8 + * for str_offsets past its header). */ + obj_reloc(e->ob, e->sec_info, 8u, R_ABS32, e->ssym_abbrev, 0); + obj_reloc(e->ob, e->sec_info, cu_header_size + e->root_stmt_list_at, + R_ABS32, e->ssym_line, 0); + obj_reloc(e->ob, e->sec_info, cu_header_size + e->root_ranges_at, + R_ABS32, e->ssym_rnglists, 12); + obj_reloc(e->ob, e->sec_info, cu_header_size + e->root_str_off_base_at, + R_ABS32, e->ssym_str_off, 8); /* Apply forward DIE refs (DW_FORM_ref4 = CU-relative, where the CU * starts at the unit_length field. body offset 0 is at section * offset cu_header_size = 12 (post-header, post-unit_length). DW5 @@ -1119,6 +1215,28 @@ void debug_emit(Debug* d) { resolve_abbrevs(&ec); + /* Pre-create every debug section + a paired SK_SECTION ObjSym, before + * any DIE/program payload is emitted. Cross-section relocations + * (CU-header debug_abbrev_offset, root-DIE stmt_list / ranges / + * str_offsets_base, .debug_line line_strp slots, .debug_str_offsets + * entries) name these symbols, so they must exist by the time the + * relocs are recorded. Section order in the output `.o` is fixed by + * obj_section call order and matches the previous emission. 
*/ + ec.sec_abbrev = mk_section(&ec, ".debug_abbrev"); + ec.sec_line = mk_section(&ec, ".debug_line"); + ec.sec_aranges = mk_section(&ec, ".debug_aranges"); + ec.sec_rnglists = mk_section(&ec, ".debug_rnglists"); + ec.sec_info = mk_section(&ec, ".debug_info"); + ec.sec_str = mk_section(&ec, ".debug_str"); + ec.sec_line_str = mk_section(&ec, ".debug_line_str"); + ec.sec_str_off = mk_section(&ec, ".debug_str_offsets"); + ec.ssym_abbrev = mk_section_sym(&ec, ec.sec_abbrev); + ec.ssym_line = mk_section_sym(&ec, ec.sec_line); + ec.ssym_rnglists = mk_section_sym(&ec, ec.sec_rnglists); + ec.ssym_str = mk_section_sym(&ec, ec.sec_str); + ec.ssym_line_str = mk_section_sym(&ec, ec.sec_line_str); + ec.ssym_str_off = mk_section_sym(&ec, ec.sec_str_off); + producer_sym = pool_intern_cstr(pool, "cfree 0.1"); if (d->nfiles > 0) { primary_dir = d->files[0].dir; @@ -1134,14 +1252,22 @@ void debug_emit(Debug* d) { form_u16(&ec.info_body, DW_LANG_C11); emit_strx4(&ec, &ec.info_body, primary_base); emit_strx4(&ec, &ec.info_body, primary_dir); - form_u32(&ec.info_body, 0); /* DW_AT_stmt_list */ + /* DW_AT_stmt_list → offset 0 in .debug_line. Write the literal so a + * bare-`.o` reader still sees the correct value; the paired R_ABS32 + * reloc emitted in emit_section_info() overwrites the slot in the + * JIT view path where multiple inputs' .debug_line sections are + * concatenated. */ + ec.root_stmt_list_at = buf_pos(&ec.info_body); + form_u32(&ec.info_body, 0); { u8 z[8] = {0}; buf_write(&ec.info_body, z, d->c->target.ptr_size); } - /* DW_AT_ranges → start of the body of .debug_rnglists, post-12-byte hdr. */ + /* DW_AT_ranges → 12 bytes into .debug_rnglists, post header. */ + ec.root_ranges_at = buf_pos(&ec.info_body); form_u32(&ec.info_body, 12); - /* DW_AT_str_offsets_base → 8 bytes into .debug_str_offsets (skip hdr). */ + /* DW_AT_str_offsets_base → 8 bytes into .debug_str_offsets, post header. 
*/ + ec.root_str_off_base_at = buf_pos(&ec.info_body); form_u32(&ec.info_body, 8); for (i = 0; i < d->ntypes; ++i) emit_type_die(&ec, (DebugTypeId)(i + 1)); diff --git a/src/dwarf/dwarf_line.c b/src/dwarf/dwarf_line.c @@ -442,7 +442,14 @@ void dw_build_line(CfreeDebugInfo* d, u32 cu_idx) { int cfree_dwarf_addr_to_line(CfreeDebugInfo* d, uint64_t pc, const char** file_out, uint32_t* line_out, uint32_t* col_out) { + /* Return codes: + * 0 — PC has a line entry; outputs filled. + * 1 — PC sits inside a CU's coverage range but no row matched. + * 2 — PC outside every CU's address coverage (e.g. JIT-emitted thunk + * or a frame inside a `.o` linked without `-g`). REPL: "no + * debug info for this frame". */ u32 i; + int any_in_range = 0; if (file_out) *file_out = NULL; if (line_out) *line_out = 0; if (col_out) *col_out = 0; @@ -451,16 +458,18 @@ int cfree_dwarf_addr_to_line(CfreeDebugInfo* d, uint64_t pc, DwLineProgram* lp; u32 j; DwLineRow* best = NULL; + uint64_t cu_lo = (uint64_t)-1, cu_hi = 0; if (!d->lines_built[i]) dw_build_line(d, i); lp = &d->lines_by_cu[i]; - /* Find the latest row with address <= pc that is in a valid sequence - * (sequence ends at end_sequence==1). */ for (j = 0; j < lp->nrows; ++j) { DwLineRow* r = &lp->rows[j]; + if (r->address < cu_lo) cu_lo = r->address; + if (r->address > cu_hi) cu_hi = r->address; if (r->end_sequence) continue; if (r->address > pc) break; best = r; } + if (pc >= cu_lo && pc <= cu_hi) any_in_range = 1; if (best) { const char* f = ""; if (best->file_index < lp->nfile_norm && lp->file_norm) @@ -471,14 +480,49 @@ int cfree_dwarf_addr_to_line(CfreeDebugInfo* d, uint64_t pc, return 0; } } - return 1; + return any_in_range ? 1 : 2; +} + +/* file_norm matches user-typed `file` if either it is exactly equal, or it + * ends with `/<file>`. Suffix matching keeps `b util.c:42` working when + * the DWARF file_norm is the absolute path the compiler saw. 
*/ +static int dw_file_matches(const char* file_norm, const char* user, size_t ulen) { + size_t flen; + if (!file_norm) return 0; + if (dw_streq(file_norm, user)) return 1; + flen = strlen(file_norm); + if (flen <= ulen) return 0; + if (file_norm[flen - ulen - 1] != '/') return 0; + return memcmp(file_norm + flen - ulen, user, ulen) == 0; } int cfree_dwarf_line_to_addr(CfreeDebugInfo* d, const char* file, uint32_t line, uint64_t* pc_out) { + /* Returns: + * 0 — unique match; pc_out filled with that PC. + * 1 — file `file` does not appear in any CU we scanned (per-DWARF.md + * "no data" semantics: caller can format this as "file not + * covered" if it cares to distinguish from a stale line). + * 2 — `file` appears in some CU but no row matches (file, line). + * 3 — ambiguous: more than one distinct PC matches (file, line) via + * suffix. pc_out is filled with the first match so callers that + * don't disambiguate still get a usable PC. Use + * cfree_dwarf_line_to_addr_all to enumerate candidates. */ + /* Ambiguity is keyed on distinct file_norm *paths* matching the + * suffix, not on distinct PCs. Multiple PCs on the same line of the + * same source file are expected (one row per instruction) — they're + * not ambiguity, just line-program granularity. 
*/ u32 i; + size_t ulen; + const char* first_path = NULL; + uint64_t first_pc = 0; + const char* alt_path = NULL; + int file_seen = 0; + int line_hits = 0; if (pc_out) *pc_out = 0; if (!d || !file) return 1; + ulen = strlen(file); + if (ulen == 0) return 1; for (i = 0; i < d->ncus; ++i) { DwLineProgram* lp; u32 j; @@ -488,14 +532,80 @@ int cfree_dwarf_line_to_addr(CfreeDebugInfo* d, const char* file, uint32_t line, DwLineRow* r = &lp->rows[j]; const char* f; if (r->end_sequence) continue; - if (r->line != line) continue; if (r->file_index >= lp->nfile_norm || !lp->file_norm) continue; f = lp->file_norm[r->file_index]; - if (!f) continue; - if (!dw_streq(f, file)) continue; - if (pc_out) *pc_out = r->address; - return 0; + if (!dw_file_matches(f, file, ulen)) continue; + file_seen = 1; + if (r->line != line) continue; + ++line_hits; + if (!first_path) { + first_path = f; + first_pc = r->address; + } else if (!alt_path && f != first_path && !dw_streq(f, first_path)) { + alt_path = f; + } } } + if (pc_out) *pc_out = first_pc; + if (alt_path) return 3; + if (line_hits > 0) return 0; + if (file_seen) return 2; return 1; } + +/* Enumerate all distinct candidate (pc, file_norm) pairs for the given + * (file, line) match. Caller-supplied `out` array is filled up to `cap`; + * `*n_out` receives the total candidate count (which may exceed cap, in + * which case only the first `cap` are written). Returns 0 on success + * (including 0 candidates), 1 on invalid args. Intended for REPL + * disambiguation after cfree_dwarf_line_to_addr returns 3. */ +int cfree_dwarf_line_to_addr_all(CfreeDebugInfo* d, const char* file, + uint32_t line, CfreeDwarfLineMatch* out, + uint32_t cap, uint32_t* n_out) { + /* One candidate per distinct file_norm path (not per PC). PC is the + * first matching row's address for that file_norm — i.e. the same PC + * that cfree_dwarf_line_to_addr would have returned for that file. 
*/ + u32 i; + size_t ulen; + uint32_t total = 0; + if (n_out) *n_out = 0; + if (!d || !file) return 1; + ulen = strlen(file); + if (ulen == 0) return 1; + for (i = 0; i < d->ncus; ++i) { + DwLineProgram* lp; + u32 j; + if (!d->lines_built[i]) dw_build_line(d, i); + lp = &d->lines_by_cu[i]; + for (j = 0; j < lp->nrows; ++j) { + DwLineRow* r = &lp->rows[j]; + const char* f; + uint32_t k; + int dup = 0; + if (r->end_sequence) continue; + if (r->line != line) continue; + if (r->file_index >= lp->nfile_norm || !lp->file_norm) continue; + f = lp->file_norm[r->file_index]; + if (!dw_file_matches(f, file, ulen)) continue; + /* Dedupe by file_norm path so the candidate list is one entry per + * source file even if the line has many per-instruction rows. */ + if (out) { + uint32_t lim = total < cap ? total : cap; + for (k = 0; k < lim; ++k) { + if (out[k].file == f || (out[k].file && dw_streq(out[k].file, f))) { + dup = 1; + break; + } + } + } + if (dup) continue; + if (out && total < cap) { + out[total].pc = r->address; + out[total].file = f; + } + ++total; + } + } + if (n_out) *n_out = total; + return 0; +} diff --git a/src/dwarf/dwarf_query.c b/src/dwarf/dwarf_query.c @@ -118,6 +118,14 @@ static void fill_varloc(CfreeDebugInfo* d, u32 cu_idx, const DwLocal* v, u64 pc, int cfree_dwarf_var_at(CfreeDebugInfo* d, uint64_t pc, const char* name, CfreeDwarfVarLoc* out) { + /* Return codes: + * 0 — found; *out filled. + * 1 — invalid args, or `pc` lies inside a known subprogram but no + * variable named `name` is visible there (the user typo case). + * 2 — `pc` is not covered by any subprogram (no debug info for this + * frame). REPL: "no debug info for this frame". Globals are + * still consulted before returning 2 so a name lookup against a + * global from a -g-less frame still resolves. 
*/ DwSubprog* sp; u32 i; if (!d || !name || !out) return 1; @@ -150,7 +158,7 @@ int cfree_dwarf_var_at(CfreeDebugInfo* d, uint64_t pc, const char* name, fill_varloc(d, 0, v, pc, out); return 0; } - return 1; + return sp ? 1 : 2; } int cfree_dwarf_loc_read(CfreeDebugInfo* d, const CfreeDwarfVarLoc* loc, diff --git a/src/link/link_jit.c b/src/link/link_jit.c @@ -418,19 +418,23 @@ static int jit_view_is_debug_name(const char* name) { } /* True if input `ii` carries any debug section that's worth surfacing. - * Cheap walk over the input's section table. */ + * Cheap walk over the input's section table. Sym values are pool-local, + * so name strings must be dereferenced through the input's *own* + * compiler pool — not the jit's pool. */ static int jit_view_input_has_debug(CfreeJit* jit, u32 ii) { ObjBuilder* ob; + Pool* in_pool; u32 nsec, k; if (ii >= jit->image->dbg_objs_n) return 0; ob = jit->image->dbg_objs[ii]; if (!ob) return 0; + in_pool = obj_compiler(ob)->global; nsec = obj_section_count(ob); for (k = 0; k < nsec; ++k) { const Section* s = obj_section_get(ob, (ObjSecId)(k + 1)); const char* nm; if (!s || !s->name) continue; - nm = pool_str(jit->c->global, s->name, NULL); + nm = pool_str(in_pool, s->name, NULL); if (jit_view_is_debug_name(nm)) return 1; } return 0; @@ -456,33 +460,90 @@ static u64 jit_view_sym_vaddr(CfreeJit* jit, u32 ii, ObjSymId obj_sym) { return s->vaddr; /* image-relative — what DWARF was emitted in */ } -/* Copy one debug section from input `ii` into `view_ob` with its - * relocations applied against final image vaddrs. Relocations in - * .debug_* are almost universally R_ABS{32,64} against a code symbol; - * link_reloc_apply ignores its P argument for those kinds, so we pass - * P=0. */ +/* Per-output-section state used while concatenating multi-input debug + * bytes. Keyed by interned name in the view compiler's pool. 
*/ +typedef struct ViewSec { + Sym view_name; /* interned in view_ob's compiler pool */ + ObjSecId view_id; + u32 cur_size; /* bytes currently in the view section */ + /* Snapshot of cur_size taken at the start of processing the current + * input. Reloc apply for SK_SECTION targets must use this value so + * that intra-input references resolve to *this* input's contribution, + * not bytes appended later from the same input. */ + u32 snap; +} ViewSec; + +/* Find-or-create a ViewSec entry by debug-section name. */ +static ViewSec* view_sec_for(CfreeJit* jit, ViewSec* tab, u32* ntab, + u32* cap_inout, const char* name, u16 flags, + u32 align, u32 entsize, ObjBuilder* view_ob, + ViewSec** tab_out) { + Heap* h = (Heap*)jit->c->env->heap; + Pool* view_pool = obj_compiler(view_ob)->global; + Sym vn = pool_intern_cstr(view_pool, name); + u32 i; + for (i = 0; i < *ntab; ++i) { + if (tab[i].view_name == vn) { + if (tab_out) *tab_out = tab; + return &tab[i]; + } + } + if (*ntab == *cap_inout) { + u32 ncap = *cap_inout ? *cap_inout * 2u : 4u; + ViewSec* na = (ViewSec*)h->realloc(h, tab, *cap_inout * sizeof(*tab), + ncap * sizeof(*tab), _Alignof(ViewSec)); + if (!na) return NULL; + tab = na; + *cap_inout = ncap; + if (tab_out) *tab_out = tab; + } else if (tab_out) { + *tab_out = tab; + } + { + ViewSec* slot = &tab[(*ntab)++]; + slot->view_name = vn; + slot->view_id = + obj_section_ex(view_ob, vn, SEC_DEBUG, SSEM_PROGBITS, flags, + align ? align : 1u, entsize, 0, 0); + slot->cur_size = 0; + slot->snap = 0; + return slot; + } +} + +/* Find a ViewSec by view-pool name (no creation). Returns NULL on miss. 
*/ +static ViewSec* view_sec_find(ViewSec* tab, u32 ntab, Sym view_name) { + u32 i; + for (i = 0; i < ntab; ++i) + if (tab[i].view_name == view_name) return &tab[i]; + return NULL; +} + +/* Copy one debug section from input `ii` into the view, applying its + * relocations against either the view-relative section prefix (for + * SK_SECTION targets pointing at a debug section) or the final image + * vaddr (for code/data symbol targets like DW_AT_low_pc). + * + * `tab` is the find-or-create table of view sections, keyed by name; + * snapshots taken at the start of this input are read from it via + * view_sec_find. */ static void jit_view_copy_debug_section(CfreeJit* jit, u32 ii, - ObjSecId in_sec_id, - ObjBuilder* view_ob) { + ObjSecId in_sec_id, ObjBuilder* view_ob, + ViewSec* tab, u32 ntab, + ViewSec* out_vs) { ObjBuilder* in_ob = jit->image->dbg_objs[ii]; const Section* in_sec = obj_section_get(in_ob, in_sec_id); + Pool* in_pool; + Pool* view_pool = obj_compiler(view_ob)->global; Heap* h; u32 nbytes, k, total_relocs; - const char* nm; - Sym view_name; - ObjSecId out_id; u8* bytes; if (!in_sec) return; nbytes = in_sec->bytes.total; if (nbytes == 0) return; + in_pool = obj_compiler(in_ob)->global; h = (Heap*)jit->c->env->heap; - - nm = pool_str(jit->c->global, in_sec->name, NULL); - if (!nm) return; - view_name = pool_intern_cstr(obj_compiler(view_ob)->global, nm); - out_id = obj_section_ex(view_ob, view_name, SEC_DEBUG, SSEM_PROGBITS, - in_sec->flags, in_sec->align ? 
in_sec->align : 1u, - in_sec->entsize, 0, 0); + (void)in_pool; bytes = (u8*)h->alloc(h, nbytes, 1); if (!bytes) return; @@ -493,10 +554,34 @@ static void jit_view_copy_debug_section(CfreeJit* jit, u32 ii, total_relocs = obj_reloc_total(in_ob); for (k = 0; k < total_relocs; ++k) { const Reloc* r = obj_reloc_at(in_ob, k); - u64 S; + u64 S = 0; + int handled = 0; if (!r || r->section_id != in_sec_id) continue; if (r->offset >= nbytes) continue; /* malformed; skip */ - S = jit_view_sym_vaddr(jit, ii, r->sym); + /* SK_SECTION target → resolve against the per-input snapshot of the + * matching view section's prefix size. This is what makes the + * concatenated multi-input view's cross-section offsets land in + * the right slot. */ + if (r->sym != OBJ_SYM_NONE) { + const ObjSym* ts = obj_symbol_get(in_ob, r->sym); + if (ts && ts->kind == SK_SECTION && ts->section_id != OBJ_SEC_NONE) { + const Section* tsec = obj_section_get(in_ob, ts->section_id); + if (tsec && tsec->kind == SEC_DEBUG && tsec->name) { + size_t tnlen = 0; + const char* tnm = pool_str(obj_compiler(in_ob)->global, tsec->name, + &tnlen); + if (tnm) { + Sym v_tn = pool_intern(view_pool, tnm, tnlen); + ViewSec* tgt = view_sec_find(tab, ntab, v_tn); + if (tgt) { + S = (u64)tgt->snap; + handled = 1; + } + } + } + } + } + if (!handled) S = jit_view_sym_vaddr(jit, ii, r->sym); /* P is unused by ABS kinds; PC-relative debug-section relocs are * not produced by cfree's debug emitter, but if some external .o * carried one against a debug section, P=0 would give a @@ -506,34 +591,33 @@ static void jit_view_copy_debug_section(CfreeJit* jit, u32 ii, r->addend, 0); } - obj_write(view_ob, out_id, bytes, nbytes); + obj_write(view_ob, out_vs->view_id, bytes, nbytes); + out_vs->cur_size += nbytes; h->free(h, bytes, nbytes); } -/* Build the view on first call. Returns NULL if no input carries - * debug info, or if more than one does (v1 doesn't concatenate cross- - * CU references; see doc/DBG.md §12). 
*/ +/* Build the view on first call. Walks every input that carries a debug + * section and concatenates per-section bytes into the view's matching + * sections. SK_SECTION relocations are resolved against the view-side + * prefix length snapshotted at the start of each input, so the merged + * CU2/CU3/... cross-section offsets land in the right slot. */ static CfreeObjFile* jit_view_build(CfreeJit* jit) { - u32 i, dbg_input_ii = UINT32_MAX, n_with_debug = 0; CfreeObjFile* view; ObjBuilder* view_ob; - u32 nsec, k; + ViewSec* tab = NULL; + u32 ntab = 0, cap = 0; + Heap* h; + u32 ii, k; + int any = 0; if (!jit->image || jit->image->dbg_objs_n == 0) return NULL; - - for (i = 0; i < jit->image->dbg_objs_n; ++i) { - if (jit_view_input_has_debug(jit, i)) { - dbg_input_ii = i; - ++n_with_debug; + for (ii = 0; ii < jit->image->dbg_objs_n; ++ii) { + if (jit_view_input_has_debug(jit, ii)) { + any = 1; + break; } } - if (n_with_debug == 0) return NULL; - if (n_with_debug > 1) { - /* v1 limitation: cross-CU offset adjustment for concatenated - * .debug_abbrev / .debug_str / .debug_str_offsets isn't wired up - * yet. Single-TU dbg sessions are the supported shape. 
*/ - return NULL; - } + if (!any) return NULL; view = cfree_objfile_empty_new(jit->c->env, jit->c->target, jit->c->target.obj); @@ -543,18 +627,53 @@ static CfreeObjFile* jit_view_build(CfreeJit* jit) { cfree_obj_close(view); return NULL; } + h = (Heap*)jit->c->env->heap; - nsec = obj_section_count(jit->image->dbg_objs[dbg_input_ii]); - for (k = 0; k < nsec; ++k) { - const Section* s = obj_section_get(jit->image->dbg_objs[dbg_input_ii], - (ObjSecId)(k + 1)); - const char* nm; - if (!s || !s->name) continue; - nm = pool_str(jit->c->global, s->name, NULL); - if (!jit_view_is_debug_name(nm)) continue; - jit_view_copy_debug_section(jit, dbg_input_ii, (ObjSecId)(k + 1), view_ob); + for (ii = 0; ii < jit->image->dbg_objs_n; ++ii) { + ObjBuilder* in_ob; + u32 nsec; + if (!jit_view_input_has_debug(jit, ii)) continue; + in_ob = jit->image->dbg_objs[ii]; + + /* Phase A: find-or-create every debug section this input contributes + * to, and snapshot cur_size as the per-input prefix. */ + nsec = obj_section_count(in_ob); + for (k = 0; k < nsec; ++k) { + const Section* s = obj_section_get(in_ob, (ObjSecId)(k + 1)); + const char* nm; + size_t nlen = 0; + ViewSec* vs; + if (!s || !s->name) continue; + nm = pool_str(obj_compiler(in_ob)->global, s->name, &nlen); + if (!nm || !jit_view_is_debug_name(nm)) continue; + vs = view_sec_for(jit, tab, &ntab, &cap, nm, s->flags, + s->align ? s->align : 1u, s->entsize, view_ob, &tab); + if (!vs) continue; + } + for (k = 0; k < ntab; ++k) tab[k].snap = tab[k].cur_size; + + /* Phase B: copy each debug section + apply relocs. cur_size grows + * as bytes get appended; SK_SECTION resolution always reads `snap` + * (the start-of-input snapshot), so intra-input references inside + * any debug section still land in this input's slice. 
*/ + for (k = 0; k < nsec; ++k) { + const Section* s = obj_section_get(in_ob, (ObjSecId)(k + 1)); + const char* nm; + size_t nlen = 0; + Sym v_nm; + ViewSec* vs; + if (!s || !s->name) continue; + nm = pool_str(obj_compiler(in_ob)->global, s->name, &nlen); + if (!nm || !jit_view_is_debug_name(nm)) continue; + v_nm = pool_intern(obj_compiler(view_ob)->global, nm, nlen); + vs = view_sec_find(tab, ntab, v_nm); + if (!vs) continue; + jit_view_copy_debug_section(jit, ii, (ObjSecId)(k + 1), view_ob, tab, + ntab, vs); + } } + if (tab) h->free(h, tab, sizeof(*tab) * cap); obj_finalize(view_ob); return view; }
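The two-phase loop in jit_view_build (snapshot every view section's cur_size, then copy bytes and resolve SK_SECTION relocations against `snap`) can be illustrated in isolation. What follows is a minimal sketch under stated assumptions, not the kit implementation: OutSec, SecReloc, Input, append_input and the fixed two-section layout are hypothetical stand-ins for ViewSec, Reloc and the ObjBuilder API, and each input carries exactly one 4-byte little-endian relocation in its info section.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not the kit/cfree types. */
enum { SEC_ABBREV = 0, SEC_INFO = 1, NSEC = 2, SEC_CAP = 64 };

typedef struct {
  uint8_t bytes[SEC_CAP];
  uint32_t cur_size; /* bytes appended to the merged section so far */
  uint32_t snap;     /* cur_size as of the start of the current input */
} OutSec;

typedef struct {
  uint32_t offset; /* slot offset inside this input's SEC_INFO bytes */
  int target;      /* section the 4-byte slot points into */
  uint32_t addend; /* input-local offset within the target section */
} SecReloc;

typedef struct {
  const uint8_t* data[NSEC];
  uint32_t size[NSEC];
  SecReloc reloc; /* a single reloc in SEC_INFO, for brevity */
} Input;

static void put_u32le(uint8_t* p, uint32_t v) {
  p[0] = (uint8_t)v;
  p[1] = (uint8_t)(v >> 8);
  p[2] = (uint8_t)(v >> 16);
  p[3] = (uint8_t)(v >> 24);
}

static uint32_t get_u32le(const uint8_t* p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Append one input to the merged sections. Returns 0 on success, 1 if a
 * section would overflow its fixed capacity. */
static int append_input(OutSec out[NSEC], const Input* in) {
  int s;
  /* Phase A: record every section's prefix length before copying anything. */
  for (s = 0; s < NSEC; ++s) {
    if (out[s].cur_size + in->size[s] > SEC_CAP) return 1;
    out[s].snap = out[s].cur_size;
  }
  /* Phase B: copy bytes, then patch the section-relative slot. The patch
   * reads `snap`, not `cur_size`: SEC_ABBREV has already grown by the time
   * SEC_INFO is processed, so cur_size would point past this input's slice. */
  for (s = 0; s < NSEC; ++s) {
    uint8_t* dst = out[s].bytes + out[s].cur_size;
    if (in->size[s]) memcpy(dst, in->data[s], in->size[s]);
    if (s == SEC_INFO && in->reloc.offset + 4u <= in->size[s])
      put_u32le(dst + in->reloc.offset,
                out[in->reloc.target].snap + in->reloc.addend);
    out[s].cur_size += in->size[s];
  }
  return 0;
}
```

Appending two inputs whose info slot refers to offset 0 of their own abbrev section leaves the first slot at 0 and rewrites the second to the length of the first input's abbrev bytes; get_u32le reads the patched slots back. Resolving against `cur_size` instead of `snap` would shift both values by the size of the input's own abbrev contribution.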