ESSAY · April 25, 2026

Code Intelligence: How Our Coding Agents Know What to Do

Cursor, Copilot, and Claude Code search a codebase with text matching, which misses most of the places that need to change. Momental's Code Intelligence gives agents the full relationship map in milliseconds.

code-intelligence · benchmarks · agents · engineering

A common scenario with AI coding tools: you ask Cursor to rename a function. It does the rename in the file you have open. You approve, you ship. A few days later something breaks because a call site in worker/billing.ts was never updated.

This happens because today’s coding agents search the codebase with text matching. They grep for the function name and edit what they find. Anything indirect — a callback, a re-export, a separate implementation in another package — is invisible to them.

Momental’s Code Intelligence replaces text search with a queryable map of every function, every caller, and every downstream dependency. The map is indexed once and ready in milliseconds.
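As a rough sketch of what one entry in such a map could look like (illustrative names only; this is not Momental's actual schema):

// Illustrative sketch only; these names are not Momental's actual schema.

interface CallSite {
  file: string;         // e.g. "worker/billing.ts"
  line: number;
  caller: string;       // the named function that contains the call
  indirect: boolean;    // passed as a callback, re-exported, aliased on import, ...
}

interface SymbolEntry {
  name: string;               // e.g. "recordAsync"
  definedIn: string[];        // separate implementations in different packages
  callers: CallSite[];
  downstreamFiles: string[];  // transitive blast radius
  relatedTests: string[];     // tests that exercise this code path
}

// The map is keyed by symbol name and precomputed at index time, so answering
// "who calls X?" is a lookup rather than a repository scan.
type CodeMap = Map<string, SymbolEntry[]>;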

We benchmarked it. The numbers below come from 20 refactor tasks on Momental’s production codebase, plus a cold-start run on microsoft/TypeScript.

Top-line results

  • +45% AI quality score when building new features
  • 65% of affected files found when renaming a feature (vs. 3% with standard search)
  • +53% better architectural decisions when the AI knows the team’s prior decisions

Same prompts, same evaluator, two retrieval layers compared.

Benchmark 1 — affected-files coverage on rename tasks

Changing how a function works (its name, its parameters, its return type) usually means updating every call site that depends on it. We ran 20 rename and refactor tasks twice each: once with standard text search (Cursor and Copilot defaults), once with Momental’s relationship map. We measured how many of the correct call sites the agent identified before writing code.
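The coverage number below is simply the share of ground-truth call sites that appear in the agent's pre-edit file list. A minimal sketch of that scoring (our wording, not the benchmark harness itself):

function coverage(identified: string[], groundTruth: string[]): number {
  // Coverage = |identified ∩ ground truth| / |ground truth|
  const found = new Set(identified);
  const hits = groundTruth.filter((file) => found.has(file)).length;
  return groundTruth.length === 0 ? 1 : hits / groundTruth.length;
}

// Hypothetical usage: coverage(filesTheAgentListed, allRealCallSites)
// yields the percentages reported in the table below.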

Files correctly identified
  • Standard text search: 3%
  • Momental Code Intelligence: 65% (↑ +62pp)

Standard search catches the open file and a handful of obvious matches. It typically misses:

  • Call sites that use the function indirectly (passed as a callback, re-exported, aliased on import)
  • Implementations in other packages of a monorepo
  • Test files that exercise the changed code path
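For a concrete (and entirely hypothetical) illustration, the snippets below sketch three files from a monorepo. Only the last one contains the literal text claimFiles( that a text search would match; the other two break just as surely when the signature changes. File names and helpers such as registerJob are made up for the example.

// Hypothetical monorepo snippets (three separate files, shown together).
// The refactor: claimFiles() gains a required ttlSeconds parameter.

// packages/storage/claim.ts: the definition being changed
export async function claimFiles(userId: string, fileIds: string[]): Promise<void> {
  // ...
}

// packages/storage/index.ts: re-exported under another name; a search for
// "claimFiles(" never lands here, yet every consumer of the alias breaks
export { claimFiles as claimUploadedFiles } from "./claim";

// worker/cleanup.ts: passed as a callback, never called by name, so the
// token "claimFiles(" does not appear anywhere in this file
import { claimFiles } from "@acme/storage";
registerJob("reclaim-orphans", claimFiles);

// api/files.ts: the one call site a text match surfaces directly
import { claimFiles } from "@acme/storage";
const claimed = await claimFiles(userId, fileIds);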

Sample tasks from the benchmark:

  • Add a required ttlSeconds param to claimFiles()
  • Narrow the getBlastRadius() return field riskLevel from string to a literal union
  • Add required includeExternal param to the impact-analysis tool
  • Make searchSymbols() return paginated results
  • Add environment parameter to saveRecord()

These are the kind of refactors that break production weeks later, when a missed caller finally executes.
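To make one of those concrete: the getBlastRadius() task narrows a field's type, which silently breaks every consumer that treats the field as a free-form string. A hypothetical before/after (the specific union values are our assumption, not taken from the codebase):

// Before: riskLevel is a free-form string, so callers can compare it
// against arbitrary values without a compile error.
interface BlastRadiusBefore {
  riskLevel: string;
  affectedFiles: string[];
}

// After: riskLevel is a literal union; every caller that compares against a
// value outside the union, or builds one dynamically, must be found and updated.
type RiskLevel = "low" | "medium" | "high" | "critical";
interface BlastRadiusAfter {
  riskLevel: RiskLevel;
  affectedFiles: string[];
}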

A worked example

Task: add a required parameter to a billing function and find every call site. Run on Momental’s production codebase, 1,130 source files.

Search: "recordAsync(" across all files
→ 37 raw lines
→ 3.8s – 10s
→ manual triage required

The agent gets a flat list of file:line matches. One line is the function definition; several others are separate implementations that happen to share the name, mixed in from different packages. There is no risk level, no test suggestion, no grouping. The agent either writes blind or spends another twenty tool calls disambiguating each match.

With Momental — code relationship map

Look up "recordAsync" in code map
→ 32 named callers
→ 422ms total (185ms lookup + 237ms callers)
→ risk level + tests + grouping included

Structured response:

  • Critical — 441 affected files downstream
  • api: GeminiService · 9 call sites · 8 methods
  • api: ClaudeAdapter.streamWithMessages · line 797
  • api: OpenAIAdapter · 3 call sites
  • api: EmbeddingService.embed · EmbeddingService.embedTextBatch
  • worker: separate implementation, 9 callers, independent impact
  • Run tests: embedding.test.ts · gemini.service.test.ts · agent12.test.ts

Each result is a named function. The duplicated worker-package implementation is flagged separately. The relevant tests are listed. The agent has what it needs to start editing.
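As a rough sketch, that structured response maps onto a shape like the following (field names are ours, not Momental's actual API):

// Illustrative shape only; field names are not Momental's API.
interface ImpactReport {
  symbol: string;                       // "recordAsync"
  risk: "low" | "medium" | "high" | "critical";
  downstreamFileCount: number;          // 441 in the run above
  implementations: Array<{
    package: string;                    // "api", "worker", ...
    callers: Array<{ name: string; file: string; line: number }>;
  }>;
  testsToRun: string[];                 // ["embedding.test.ts", ...]
}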

Standard search vs. Momental
  • Time: 3.8–10s → 422ms (9–24× faster)
  • Output: 37 lines, one of which is the definition → 32 named callers
  • Manual review: yes (20 files, no grouping) → none
  • Risk level: unknown → Critical (441 files affected downstream)
  • Tests to run: unknown → 3 specific test files
  • Multiple implementations: mixed into a flat list → 2 separate, each with its own impact analysis

Measured live in April 2026. Momental’s production codebase, 1,130 source files. Standard search: full-repository scan, no path scoping. Momental: 185ms + 237ms = 422ms total.

External validation: microsoft/TypeScript

To rule out the obvious objection (that Momental might only be fast on a codebase it has been working with for months), we ran the same test on the Microsoft TypeScript compiler. 379,000 lines, cold start, never previously indexed.

Standard search vs. Momental
  • Time to find all callers: 3.8–10s → ~524ms (~20× faster)
  • Risk level provided: no → yes
  • Specific tests identified: no → yes

20 of 20 tasks measured in April 2026, 13m 16s total. microsoft/TypeScript, 601 source files, 379k lines. Momental: ~275ms lookup + ~227ms callers = ~502ms total.

Benchmark 2 — architectural decisions with team history

The other half of the problem is knowing how a team has previously decided to build things: which database pattern to use, how billing is handled, which internal wrapper to use instead of the vendor API. Most of this lives in Slack threads from eight months ago, or in nobody’s head.
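A hedged sketch of what one surfaced decision might look like to the agent (our field names; this post does not show Momental's actual record format):

// Hypothetical shape for a surfaced team decision; not Momental's schema.
// The point: the agent receives the prior decision alongside the code map
// instead of rediscovering it, or missing it, in old Slack threads.
interface TeamDecision {
  topic: string;         // "AI vendor calls"
  decision: string;      // "route through the internal billing-aware wrapper"
  rationale: string;     // "per-org cost caps and usage metering live there"
  decidedOn: string;     // ISO date
  relatedCode: string[]; // e.g. ["api/ai/billingAwareClient.ts"] (made-up path)
}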

We asked the AI five architectural questions from our own engineering work. An independent grader scored each answer on a 1–10 scale.

Scores are without team history → with Momental.
  • Add a database query — raw SQL or the ORM? 7.2/10 → 8.2/10 (↑ +14%)
  • New AI call — run inline or queue as a background job? 7.5/10 → 9.0/10 (↑ +20%)
  • New AI feature — vendor API directly, or our internal billing-aware wrapper? 6.0/10 (misses billing rules) → 9.2/10 (↑ +53%)
  • New AI text-extraction feature — how should it be built? 6.2/10 (misses cost controls) → 9.0/10 (↑ +45%)

The largest gains are on questions where the right answer depends on team-specific rules. Generic AI knows the public best practice; it does not know that a team committed months ago to using the internal billing-aware wrapper. Momental surfaces that prior decision and the AI scores 9.2/10 rather than 6.0/10.

Methodology

Each of the 20 tasks runs twice, once with standard search and once with Momental. Same prompts, same model. We measure five dimensions per run, computed fresh from the codebase each time:

  • Complete: did the AI find every place that needs to change?
  • Precise: did it avoid false positives that would send the agent off on unrelated edits?
  • Fast: wall-clock time from prompt to actionable output.
  • Structured: did the response come back grouped, named, and risk-scored, or as a flat list of grep hits?
  • Decision-grade: did the reasoning incorporate prior team decisions, or fall back to public-doc generic advice?
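Concretely, each run can be summarized as a record along these lines (a hypothetical sketch; the published harness may structure its output differently):

// Hypothetical per-run record for the five dimensions above.
interface RunScore {
  task: string;                 // e.g. "add required ttlSeconds to claimFiles()"
  retrieval: "standard-search" | "momental";
  complete: number;             // fraction of required files identified
  precise: number;              // 1 minus the false-positive rate
  fastMs: number;               // wall-clock time, prompt to actionable output
  structured: boolean;          // grouped, named, and risk-scored output?
  decisionGrade: boolean;       // did reasoning use prior team decisions?
}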

The methodology is published and the runs are reproducible.

What changes for coding agents

Standard text search forces a coding agent to start each task by guessing what the relevant code is. Code Intelligence replaces the guess with a lookup. Downstream behavior changes accordingly: fewer wasted edits, fewer silent breakages, and fewer cases where the model never saw a file that mattered.

Our agents — Sirius, Vega, Vidar — ship code that holds together at the cross-package level. The underlying models are unchanged. They walk into the task with the map already drawn.

Try it

Code Intelligence is available on every Momental tier, including Build. If you connect Claude Code, Codex, or Cursor via MCP, your agent gets the same code map our internal agents use.

Join the waitlist → · See the engine → · Meet the agents →

Final word

Build. Learn. Grow.

World-class growth teams are rare. Momental is how you get one anyway.

Talk to a human