CP Core Approaches Specification

Purpose

The CP Core category contains the fundamental algorithmic patterns used in competitive programming and general software engineering. These approaches focus on efficient array manipulation, sequence processing, basic search techniques, and number theory. Each approach is designed to provide optimal time complexity for its specific sub-problem, serving as the building blocks for more complex ARES programs.

Prerequisites

Registry Access: These approaches are indexed under the cp_core category.
Variable Binding: Target variables must be valid containers (for array approaches) or numeric types (for number theory).

Dependencies

src/registry/categories/cp_core.ts: The implementation source.
Target STL / Built-ins: Requires <algorithm> (C++), bisect (Python), and Math (TS).

Formal Definition

The CP Core category $\mathcal{C}_{core}$ is a subset of the Approach Registry $\mathcal{R}$ . Key sub-categories include:

Array & Sequence: $\text{PrefixSum}, \text{SlidingWindow}, \text{TwoPointers}$
Search: $\text{BinarySearch}, \text{TernarySearch}, \text{ParametricSearch}$
Greedy: $\text{IntervalScheduling}, \text{IntervalMerge}$
Number Theory: $\text{GCD}, \text{LCM}, \text{ModPow}, \text{Sieve}$

Operational Behavior

Prefix/Suffix Sums. Constructs an auxiliary array in $O(N)$ time where each element $i$ is the sum of elements from $0$ to $i$ . Enables $O(1)$ range sum queries. Mathematically, this represents a discrete integration of the input sequence. For a 2D matrix $A$ , the system extends this logic using the Inclusion-Exclusion Principle to calculate the sum of any sub-rectangle in constant time. This allows the compiler to optimize multi-dimensional data lookups.
Two Pointers. Maintains two index variables (usually lo and hi) to solve problems like "Finding Pair with Sum X" in $O(N)$ on a sorted array. The technique relies on the monotonicity of the underlying data structure. The system ensures that the pointers move in a single direction without backtracking, which mathematically guarantees that each element is processed at most twice. This provides a linear bound on the execution time.
Monotonic Data Structures. Uses a stack or deque to maintain a sorted subset of elements (e.g., indices of the next greater element), solving range minimum or maximum problems in $O(N)$ . These structures enforce an invariant property where elements are always stored in either increasing or decreasing order. When a new element violates the order, the system removes elements from the back until the property is restored. This behavior is essential for solving "Sliding Window Maximum" problems efficiently.
Modular Power. Implements binary exponentiation to calculate $(a^b) \mod m$ in $O(\log b)$ time, preventing overflow by applying the modulo at each step. The implementation uses a Divide and Conquer approach based on the relationship $a^b = (a^{b/2})^2$ for even exponents. By reducing the exponent by half in each iteration, the algorithm minimizes the number of multiplications required. This is a fundamental primitive for cryptographic and number-theoretic calculations.
Disjoint Set Union (DSU). Manages a collection of non-overlapping sets and provides an efficient way to check if two elements belong to the same component. The system applies path compression and union by rank to keep the tree structures shallow. Mathematically, this results in a nearly constant amortized time complexity per operation. The performance is bounded by the Inverse Ackermann function $\alpha(n)$ , which is less than 5 for all practical values of $n$ .

Implementation Mapping

Implementation details in src/registry/categories/cp_core.ts:

Array Manipulation:
- prefix_sum (Line 9): Uses itertools.accumulate in Python.
- sliding_window (Line 41): Implements a fixed-size sum window.
- inversion_count (Line 73): Nested function implementation using merge sort logic.
Search Patterns:
- lower_bound/upper_bound (Lines 83-91): Direct STL/Library mapping.
- parametric_search (Line 107): Standard while(lo <= hi) binary search on values.
Mathematical primitives:
- mod_pow (Line 168): Custom loop in C++/TS, built-in pow(a,b,m) in Python.
- sieve (Line 186): Standard bool-array prime marking.

Mathematical Model

The Prefix Sum relationship is defined as: $P_i = \sum_{j=0}^{i-1} A_j$ Range sum query for $[L, R]$ (0-indexed, inclusive) becomes: $\text{Sum}(L, R) = P_{R+1} - P_L$

The GCD (Euclidean Alg) satisfies: $\gcd(a, b) = \gcd(b, a \mod b), \gcd(a, 0) = a$

Complexity

Linear Time ( $O(N)$ ): Prefix sums, two pointers, sliding window, monotonic stack/queue.
Logarithmic Time ( $O(\log N)$ ): Binary search, lower/upper bound, modular exponentiation, GCD.
Log-Linear Time ( $O(N \log N)$ ): Coordinate compression, inversion count, interval merging.

Examples

Example 1: Range Sum Inference

ARES Source:

ares
read nums as vector<int>
use prefix_sum on nums
print pref[3] - pref[0]

Emitted C++:

cpp
// 1. Reading high-level container
int n_nums; cin >> n_nums;
vector<long long> nums(n_nums);
for(int i=0; i<n_nums; ++i) cin >> nums[i];

// 2. Prefix sum intent realization
vector<long long> pref(nums.size() + 1, 0);
for(int i=0; i<(int)nums.size(); ++i) pref[i+1] = pref[i] + nums[i];

// 3. Constant time query
cout << pref[3] - pref[0] << endl;

Example 2: Binary Search on Answer

ARES Source:

ares
use parametric_search on ans with lo=0, hi=1000000

Emitted C++:

cpp
long long ans = -1;
long long lo = 0, hi = 1000000;
while (lo <= hi) {
    long long mid = lo + (hi - lo) / 2;
    if (check(mid)) { // check() is inferred from surrounding logic
        ans = mid;
        hi = mid - 1;
    } else {
        lo = mid + 1;
    }
}

Traces

Trace for use sliding_window on data with k=3:

Emitter identifies k=3 from parameters.
C++ template injected: initializes windowSum = 0, best = 0.
Loop runs from $0$ to $N$ .
Adds data[i], subtracts data[i-3] when $i \ge 3$ .
Updates best with std::max.

Edge Cases

Empty Arrays: All array-based core approaches check {{target}}.size() or equivalent, though a zero-length array may result in uninitialized variable errors in the emitted code if not guarded.
Modulo Zero: mod_pow and mod_inverse assume a positive non-zero modulo; division by zero is not handled at the emitter level.
Integer Overflow: C++ templates use long long for sums to prevent overflow on large input values.

Failure Modes

Non-Sorted Binary Search: Using lower_bound or binary_search on an unsorted array without the --inference flag will produce incorrect runtime results.
Memory Limit: Large sieves (e.g., $N > 10^8$ ) may exceed runtime memory limits ( $>100$ MB).

Compatibility

Multi-Language: Identical results for GCD, sieve, and sequence processing across all backends.

Testing

Invariant Verification: Tests in tests/core_approaches.test.ts verify that prefix_sum followed by a custom query matches hand-calculated sums.
Boundary Tests: Verifying sifting and searching on arrays of size $0, 1, 2$ .

Related Entities

8_registry_approaches/02_cp_advanced.md: Builds upon these primitives for graph and string algorithms.
src/semantics/analyzer.ts: Automatically prepends sort when searching.

Source Attribution

Implementation: src/registry/categories/cp_core.ts.
Algorithms: Derived from standard CP resources (USACO, Codeforces EDU).