00_registry

registry

Approach Registry Architecture

Purpose

The ARES Approach Registry is the "Omni-Database" of the compiler, containing the formal definitions, cross-language templates, and complexity metadata for every high-level algorithm supported by the DSL. It functions as a semantic bridge that allows the developer to specify what to do (e.g., use dijkstra) while the registry handles the how for each target language. This abstraction enables the ARES compiler to remain "Universal" and "Meta," as adding support for a new algorithm only requires adding a entry to the registry.

Prerequisites

  • Approach Metadata: Each algorithm must be defined with a name, description, layer, and complexity.
  • Target Templates: Emitters require target-specific strings (C++, Python, TS) for code inlining.
  • Dependency List: UMR requires a list of external libraries required by the approach.

Dependencies

  • src/registry/approach_registry.ts: The central management class.
  • Modular Categories:
    • cp_core.ts
    • cp_advanced.ts
    • swe_systems.ts
    • data_science.ts
    • cp_specialized.ts
  • src/types.ts: Defines the ApproachInfo structure.

Formal Definition

Let R\mathcal{R} be the Approach Registry. R\mathcal{R} is a set of ApproachInfo objects {a1,a2,...,an}\{ a_1, a_2, ..., a_n \}. An ApproachInfo aa is a tuple: a=(N,D,L,C,Deps,Tcpp,Tpy,Tts)a = (N, D, L, C, \text{Deps}, T_{cpp}, T_{py}, T_{ts})

Where:

  • NN: Canonical name (and a set of aliases).
  • DD: Human-readable description.
  • LL: Registry layer (1: Primitive, 2: Utility, 3: Full Algorithm).
  • CC: Big-O Complexity (used for Phase 5 routing).
  • Deps\text{Deps}: Required external modules.
  • TT: Language-specific emission templates.

Operational Behavior

  1. Lazy Categorization: Approaches are grouped into functional categories (Core, Advanced, SWE, DS) to ensure modularity and ease of maintenance.
  2. Alias Resolution: The registry supports multiple names for the same algorithm (e.g., prefix_sum might be aliased as cumulative_sum), allowing for flexible developer input.
  3. Template Substitution: Templates use double-curly braces (e.g., {{target}}, {{params.x}}) for dynamic injection of AST variables and parameters during emission.
  4. Canonical Search: The search() method allows the Semantic Analyzer to provide "Did you mean?" suggestions when an unknown approach is used.
  5. Multi-Language Parity: The registry enforces that every approach has a template for each supported backend, though some may be no-ops or warnings if the language is unsuitable for that algorithm.

Implementation Mapping

Implementation details in src/registry/approach_registry.ts:

  • Internal Map: approaches (canonical names) and aliases (alternate names) (Lines 12-13).
  • Registry Initialization: initializeRegistry() (Line 19) loads all modular category files.
  • Base Primitives: Lines 28-61 define built-in primitives like sort and binary_search directly in the registry.
  • Search API: search(query) (Line 115) performs fuzzy matching against name, description, and aliases.
  • Categorical Retrieval: getByCategory(category) (Line 103) filters the registry for specific sub-domains.

Mathematical Model

The search function S(q)S(q) for a query string qq is defined as: S(q)={aRqname(a)qdesc(a)qalias(a)}S(q) = \{ a \in \mathcal{R} \mid q \sqsubset \text{name}(a) \lor q \sqsubset \text{desc}(a) \lor q \sqsubset \text{alias}(a) \} Where \sqsubset represents the substantive substring inclusion relation.

Complexity

  • Initialization Time: O(N)O(N) where NN is the total number of approaches (typically <10< 10ms for hundreds of entries).
  • Lookup Time: O(1)O(1) constant-time access via the Internal Map.
  • Search Time: O(N)O(N) linear scan of the registry.

Examples

Approach Definition:

typescript
{ name: "prefix_sum", aliases: ["pre_sum", "cumulative_sum"], complexity: "O(N)", cpp_template: "// C++ implementation...", py_template: "import itertools\n{{target}} = list(itertools.accumulate({{target}}))" }

Traces

Trace for use binary_search on data with target=x:

  1. Generator calls registry.get("binary_search").
  2. Returns the ApproachInfo for Binary Search.
  3. Emitter finds cpp_template.
  4. Replaces {{target}} with data.
  5. Replaces {{params.target}} with x.
  6. Emits the final C++ code to the source buffer.

Edge Cases

  • Case Insensitivity: All lookups and search queries are neutralized to lowercase to prevent stylistic mismatch errors.
  • Ambiguous Aliases: If two approaches share the same alias, the registry priority is determined by the order of registration (Modular categories after Core categories).
  • Missing Templates: If an emitter requests a template for a language that wasn't provided, it falls back to a descriptive comment in the generated code.

Failure Modes

  • Duplicate Names: Registering an approach with a name that already exists over-writes the previous entry.
  • Invalid Placeholders: Using a placeholder in a template that isn't provided by the parser (e.g., {{params.y}} when only x was passed) results in the literal placeholder remaining in the output.

Compatibility

  • Extensibility: New approaches can be added by simply creating a new .ts category file and importing it into the main registry.

Testing

  • Parity Verification: tests/registry.test.ts iterates through the entire registry and ensures that every entry has a non-empty template for C++, Python, and TypeScript.
  • Complexity Correctness: Verifies that complexity strings follow the standard Big-O notation recognized by the Profiler.

Related Entities

  • src/semantics/analyzer.ts: The primary consumer of registry search functionality.
  • 6_compiler_and_runtime/04_cpp_emitter.md: Uses these templates for emission.
  • 8_registry_approaches/01_cp_core.md: Deep dive into the core CP algorithms.

Source Attribution

  • Primary Source: src/registry/approach_registry.ts.
  • Patterns: Inspired by the "Strategy Pattern" in software architecture.
ARES