search_codebase

Semantic search over the wiki — natural-language queries with vector retrieval and full-text fallback, freshness-boosted, and federated across repos in workspace mode.

Discovery tool. When get_answer returns low confidence — or when the agent wants to enumerate candidate pages on a topic before committing to one — search_codebase returns a ranked list of wiki pages with snippets and relevance scores.

When to call

After get_answer when confidence is medium or low — explore the alternatives.
Topic enumeration — find every page that mentions "caching", "rate limiting", "websocket", etc.
When a question is too broad to answer — list the surface, then pick what to drill into with get_context.

Parameters

Prop	Type

Returns

Field	Description
`results`	Ranked list of result objects

Each result contains:

page_id, title, page_type
snippet — excerpt from the page body
relevance_score — raw 0–10 score from the underlying store
confidence_score — normalized 0–1 relative to the top result
repo — (workspace mode only) which repo it came from
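As a rough illustration of how the two scores relate, the sketch below derives confidence_score from relevance_score and then thresholds on it. It assumes normalization is a plain division by the top result's relevance_score, which the page does not spell out. The helper names (add_confidence, keep_confident), the sample data, and the 0.5 cutoff are hypothetical and not part of the tool.

```python
def add_confidence(results):
    """Return copies of results with a confidence_score in the 0-1 range.

    Assumption: confidence is relevance_score divided by the best
    relevance_score in the list; the real normalization may differ.
    """
    if not results:
        return []
    top = max(r["relevance_score"] for r in results)
    if top <= 0:
        # Degenerate case: no positive scores, so report zero confidence.
        return [{**r, "confidence_score": 0.0} for r in results]
    return [{**r, "confidence_score": r["relevance_score"] / top} for r in results]


def keep_confident(results, min_confidence=0.5):
    """Keep results at or above a cutoff (0.5 is illustrative, not documented)."""
    return [r for r in results if r["confidence_score"] >= min_confidence]


# Hypothetical results shaped like the fields listed above.
sample = [
    {"page_id": "auth-flow", "title": "Authentication flow",
     "page_type": "file_page", "snippet": "...", "relevance_score": 8.0},
    {"page_id": "sessions", "title": "Session storage",
     "page_type": "file_page", "snippet": "...", "relevance_score": 2.0},
]
scored = add_confidence(sample)
confident = keep_confident(scored)
```

Under these assumptions the top result always carries a confidence of 1.0, so a threshold on confidence_score says how close a page is to the best match rather than how good the best match is in absolute terms.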

Example

search_codebase("authentication flow")
search_codebase("database migrations", limit=10, page_type="file_page")
search_codebase("caching strategies", repo="all")

Things worth knowing

Vector + FTS fallback — semantic search runs first with an 8-second timeout. If the vector store is unavailable or slow, full-text search takes over. The agent doesn't need to know which fired.
Fetch-and-filter — fetches 3× the requested limit, then filters by page_type and a minimum relevance threshold (0.03). Keeps results rank-stable when filters are applied.
Freshness boost — results are re-ranked by git activity. Files with commits in the last 30 days get a 1.0× boost, 60–90 days get 0.5×, inactive get 0.0×. Recently-touched code wins ties.
Workspace-wide (repo="all") — uses Reciprocal Rank Fusion (k=60) to merge per-repo results. Confidence scores are renormalized within the merged set.
relevance_score vs confidence_score — relevance_score is raw and absolute; confidence_score is normalized against the top result and is the one to display or threshold on.
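Two of the mechanics above can be sketched in a few lines: fetch-and-filter, and the Reciprocal Rank Fusion merge used for repo="all". This is a minimal sketch, not the tool's implementation. The function names, the fetch callable, and the dict and tuple shapes are hypothetical; only the 3× over-fetch, the 0.03 threshold, and k=60 come from the text. Renormalizing confidence by dividing by the top fused score is likewise an assumption.

```python
from collections import defaultdict

RRF_K = 60            # constant quoted in the text
OVERFETCH = 3         # "fetches 3x the requested limit"
MIN_RELEVANCE = 0.03  # minimum relevance threshold quoted in the text


def fetch_and_filter(fetch, limit, page_type=None):
    """Over-fetch, drop weak or wrong-type results, then cut to the limit.

    fetch is a hypothetical callable: fetch(n) -> list of result dicts,
    already ordered best first.
    """
    candidates = fetch(limit * OVERFETCH)
    kept = [
        r for r in candidates
        if r["relevance_score"] >= MIN_RELEVANCE
        and (page_type is None or r["page_type"] == page_type)
    ]
    return kept[:limit]


def rrf_merge(per_repo_rankings, k=RRF_K):
    """Merge per-repo rankings with Reciprocal Rank Fusion.

    per_repo_rankings maps repo name -> list of page_ids, best first.
    Each entry contributes 1 / (k + rank), with rank starting at 1.
    Returns (repo, page_id, fused_score, confidence_score) tuples, where
    confidence is the fused score divided by the top fused score.
    """
    fused = defaultdict(float)
    for repo, page_ids in per_repo_rankings.items():
        for rank, page_id in enumerate(page_ids, start=1):
            fused[(repo, page_id)] += 1.0 / (k + rank)
    # sorted() is stable, so equal scores keep their insertion order.
    ordered = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    if not ordered:
        return []
    top = ordered[0][1]
    return [(repo, pid, score, score / top) for (repo, pid), score in ordered]


merged = rrf_merge({"api": ["auth", "cache"], "web": ["login"]})
```

Because RRF uses only rank positions, raw relevance scores from different repos never have to be comparable. That is one reason a rank-based merge is a natural fit for federated search.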

get_answer already runs a search internally before synthesizing. Reach for search_codebase only when you specifically want the list, not the answer.
