Greptile Architecture

Greptile is an AI-powered codebase intelligence platform that provides context-aware code review, semantic search, and natural language querying over entire codebases. Built on graph-based RAG (retrieval-augmented generation), it goes beyond simple diff-based review tools by constructing a comprehensive knowledge graph of every code entity and its relationships.


Table of Contents

  1. High-Level System Architecture
  2. Indexing Pipeline
  3. Code Review Agent Lifecycle
  4. Query & Search Flow
  5. Integration Architecture
  6. State Management
  7. Security & Data Flow

High-Level System Architecture

Greptile's architecture is organized into four primary subsystems: Ingestion, Intelligence, Delivery, and Integrations. The Ingestion layer handles repository cloning and graph construction. The Intelligence layer provides the AI-powered analysis engine. The Delivery layer exposes results through multiple interfaces. The Integrations layer connects Greptile to the broader developer toolchain.

The system is designed around a separation of concerns where the computationally expensive indexing work happens asynchronously, while queries and reviews operate against the pre-built graph for low-latency responses. Hatchet serves as the workflow orchestration engine, managing the complex, resource-intensive indexing pipelines with proper memory management, durability, and fair scheduling across concurrent users.

```mermaid
flowchart TB
    subgraph Sources["Source Control Platforms"]
        GH[GitHub / GitHub Enterprise]
        GL[GitLab / GitLab Self-Hosted]
        BB[Bitbucket]
    end
    subgraph Ingestion["Ingestion Layer"]
        RC[Repository Cloner]
        AST[AST Parser]
        DG[Docstring Generator]
        EMB[Embedding Engine]
        GM[Graph Mapper]
        HT[Hatchet Workflow Engine]
        RC --> AST
        AST --> DG
        DG --> EMB
        EMB --> GM
        HT -.->|orchestrates| RC
        HT -.->|orchestrates| AST
        HT -.->|orchestrates| DG
        HT -.->|orchestrates| EMB
        HT -.->|orchestrates| GM
    end
    subgraph Storage["Persistence Layer"]
        VDB[(Vector Store<br/>Embeddings)]
        CG[(Code Graph<br/>Entities & Relations)]
        Cache[(Prompt Cache<br/>~90% Hit Rate)]
    end
    subgraph Intelligence["AI Intelligence Layer"]
        subgraph Agent["Claude Agent SDK (v3)"]
            Loop[Agent Loop]
            CS[Codebase Search Tool]
            GHist[Git History Tool]
            Rules[Learned Rules Tool]
            DT[Dependency Tracer]
        end
        SemSearch[Semantic Search Engine]
        KWSearch[Keyword Search Engine]
        AgSearch[Agentic Search Engine]
    end
    subgraph Delivery["Delivery Layer"]
        API[REST API<br/>api.greptile.com/v2]
        WebApp[Web App<br/>app.greptile.com]
        CLI[CLI<br/>npm i -g greptile]
        MCP[MCP Server]
        PRBot[PR Review Bot]
    end
    subgraph Integrations["External Integrations"]
        Slack[Slack]
        Jira[Jira]
        Linear[Linear]
        Notion[Notion]
        GDrive[Google Drive]
        Sentry[Sentry]
        Datadog[Datadog]
    end
    Sources -->|clone / pull| Ingestion
    GM --> VDB
    GM --> CG
    CG --> Intelligence
    VDB --> Intelligence
    Cache --> Agent
    Intelligence --> Delivery
    Delivery --> Integrations
    Integrations -->|context via MCP| Intelligence
```

Component Responsibilities

| Component | Role |
| --- | --- |
| Repository Cloner | Fetches repository content from source control via API |
| AST Parser | Parses every file to extract functions, classes, variables, files, and directories |
| Docstring Generator | Recursively generates semantic summaries for each AST node |
| Embedding Engine | Converts docstrings into vector representations for similarity search |
| Graph Mapper | Builds the relational graph connecting all entities via calls, imports, and dependencies |
| Hatchet | Orchestrates the indexing workflow with durability, fair scheduling, and memory management |
| Claude Agent SDK | Powers the autonomous review agent in v3 with multi-hop reasoning |
| MCP Server | Exposes Greptile capabilities to AI agents and IDEs via the Model Context Protocol |

Indexing Pipeline

The indexing pipeline transforms a raw repository into a queryable knowledge graph. This is the most computationally intensive part of the system and runs asynchronously. Small repositories complete in 3-5 minutes, while large codebases (Linux kernel, CPython) can take over an hour.

The pipeline follows a strict sequential flow because each stage depends on the output of the previous one. AST parsing must complete before docstrings can be generated, docstrings must exist before embeddings can be computed, and all entities must be identified before relationships can be mapped.
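The sequential dependency can be sketched as a chain of stage functions, where each stage consumes the previous stage's output. The stage functions and data shapes below are illustrative stand-ins, not Greptile's internal API:

```python
# Illustrative sketch of the sequential indexing stages; every name and
# data shape here is a hypothetical stand-in, not Greptile's API.

def parse_ast(files):
    """Stage 2: extract one entity record per definition found in each file."""
    return [{"file": f, "entity": f"{f}:func"} for f in files]

def generate_docstrings(entities):
    """Stage 3: attach a semantic summary to every entity."""
    return [{**e, "docstring": f"summary of {e['entity']}"} for e in entities]

def embed(docstringed):
    """Stage 4: vectorize each docstring (toy 'embedding': its length)."""
    return [{**e, "vector": [float(len(e["docstring"]))]} for e in docstringed]

def map_relationships(embedded):
    """Stage 5: connect entities (here, a trivial chain of 'calls' edges)."""
    return {"nodes": embedded,
            "edges": [("calls", a["entity"], b["entity"])
                      for a, b in zip(embedded, embedded[1:])]}

def index_repository(files):
    # Each stage depends on the previous one, so the chain is strictly
    # sequential: no embedding before docstrings, no edges before entities.
    return map_relationships(embed(generate_docstrings(parse_ast(files))))

graph = index_repository(["a.py", "b.py"])
print(len(graph["nodes"]), len(graph["edges"]))  # 2 nodes, 1 edge
```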

Greptile's cloud product does not permanently store customer code. After parsing and embedding, only the derived artifacts (graph, vectors, docstrings) are retained. When code snippets are needed during reviews or queries, they are pulled on-demand from the source control API.
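As a sketch of the on-demand pattern, a reviewer-side fetch might build a request against GitHub's contents API (a real endpoint that returns base64-encoded file content; the owner, repo, path, and token below are placeholders):

```python
import base64
import json
from urllib.request import Request

# Sketch of on-demand snippet retrieval: rather than reading stored code,
# the reviewer fetches a file from the source control API at review time.
# Repository, path, and token values are placeholders.

def build_snippet_request(owner: str, repo: str, path: str, ref: str, token: str) -> Request:
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={ref}"
    return Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })

def decode_snippet(response_body: str) -> str:
    # GitHub returns the file content base64-encoded in the "content" field.
    payload = json.loads(response_body)
    return base64.b64decode(payload["content"]).decode("utf-8")

req = build_snippet_request("acme", "webapp", "src/auth.py", "main", "ghp_example")
print(req.full_url)
```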

```mermaid
flowchart LR
    subgraph Trigger["Trigger"]
        API_Call["POST /repositories"]
        Webhook["PR Webhook"]
        Schedule["Re-index Schedule"]
    end
    subgraph Pipeline["Indexing Pipeline (Hatchet-Orchestrated)"]
        direction TB
        Clone["1. Clone Repository<br/>via GitHub/GitLab API"]
        Parse["2. AST Parsing<br/>Extract entities per file"]
        Docstr["3. Recursive Docstring Gen<br/>Summarize each AST node"]
        Embed["4. Embedding Generation<br/>Vectorize docstrings"]
        Graph["5. Relationship Mapping<br/>Connect calls, imports, deps"]
        Persist["6. Graph Persistence<br/>Store to graph DB + vector DB"]
    end
    subgraph Outputs["Indexed Artifacts"]
        Entities["Entity Catalog<br/>functions, classes, vars"]
        Vectors["Vector Index<br/>semantic embeddings"]
        Relations["Relationship Graph<br/>calls, imports, deps"]
        Meta["Metadata<br/>commit SHA, timestamps"]
    end
    Trigger --> Clone
    Clone --> Parse
    Parse --> Docstr
    Docstr --> Embed
    Embed --> Graph
    Graph --> Persist
    Persist --> Entities
    Persist --> Vectors
    Persist --> Relations
    Persist --> Meta
```

Entities and Relationships Extracted

During parsing, Greptile extracts a rich set of code entities and maps their interconnections:

- Entities: files, directories, classes, functions, and variables
- Relationships: calls, imports, and dependency links connecting those entities

This graph structure is what differentiates Greptile from simpler RAG-on-code approaches. Codebases are interconnected graphs, not collections of standalone documents. A change to one function may have cascading effects through callers, dependents, and related patterns across the entire repository. The graph captures these connections, enabling the review agent to trace impact paths that linear search would miss.
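A minimal sketch of such a graph, with typed entity nodes and relationship edges, shows why reverse traversal (who calls this function?) is cheap once the graph exists. Field and relation names here are illustrative:

```python
from dataclasses import dataclass, field

# Minimal sketch of a code graph: typed nodes for entities, typed edges for
# relationships. Names are illustrative, not Greptile's schema.

@dataclass(frozen=True)
class Entity:
    kind: str   # e.g. "file", "directory", "class", "function", "variable"
    name: str

@dataclass
class CodeGraph:
    entities: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (relation, src, dst) triples

    def add_edge(self, relation: str, src: Entity, dst: Entity):
        self.entities.update({src, dst})
        self.edges.append((relation, src, dst))

    def callers_of(self, fn: Entity):
        # Reverse traversal: who calls `fn`? Exactly the kind of impact
        # question that linear search over standalone documents misses.
        return [src for rel, src, dst in self.edges if rel == "calls" and dst == fn]

g = CodeGraph()
save = Entity("function", "save_user")
handler = Entity("function", "handle_request")
g.add_edge("calls", handler, save)
print([e.name for e in g.callers_of(save)])  # ['handle_request']
```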


Code Review Agent Lifecycle

The code review agent is Greptile's flagship feature. In v3, it is built on the Anthropic Claude Agent SDK and operates as a fully autonomous investigator rather than following a rigid, predetermined flowchart.

When a pull request is opened, the agent receives the diff and begins an investigation loop. At each step, the agent decides what to explore next based on what it has learned so far. It might start by examining the changed functions, then trace their callers, check git history to understand why a pattern exists, compare against similar functions elsewhere in the codebase, and verify that the change doesn't break any contracts with dependent code.

The agent has a high limit on inference calls and tool invocations, enabling deep, recursive investigation. This is the key architectural shift from v2 (which used a rigid flowchart) to v3 (which uses autonomous, multi-hop reasoning). The result is a 70.5% higher acceptance rate and the ability to catch 3x more critical bugs.
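The loop can be pictured as a decide-act cycle in which each finding informs the next tool choice. The decision policy and canned tool results below are toy stand-ins for the agent's reasoning, not the SDK's API:

```python
# Illustrative sketch of the v3 investigation loop: at every step the agent
# picks the next tool based on accumulated findings instead of following a
# fixed sequence. The policy and results are toy stand-ins.

def decide_next_tool(findings):
    if not findings:
        return "codebase_search"          # start from the changed code
    if findings[-1] == "found_callers":
        return "git_history"              # understand why the pattern exists
    if findings[-1] == "found_rationale":
        return "learned_rules"            # check team conventions
    return None                           # enough evidence gathered

TOOL_RESULTS = {                          # canned results for the sketch
    "codebase_search": "found_callers",
    "git_history": "found_rationale",
    "learned_rules": "rule_applies",
}

def investigate(max_steps=25):
    findings = []
    for _ in range(max_steps):            # a high but bounded step budget
        tool = decide_next_tool(findings)
        if tool is None:
            break
        findings.append(TOOL_RESULTS[tool])
    return findings

print(investigate())  # ['found_callers', 'found_rationale', 'rule_applies']
```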

Prompt caching plays a critical role in cost efficiency. With cache hit rates approaching 90%, v3 achieves 75% lower inference costs for self-hosted deployments despite consuming approximately 3x more context tokens than v2.

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant SCM as GitHub / GitLab
    participant Bot as PR Review Bot
    participant Agent as Claude Agent (v3)
    participant Graph as Code Graph
    participant Git as Git History
    participant Rules as Learned Rules
    participant Cache as Prompt Cache
    Dev->>SCM: Opens Pull Request
    SCM->>Bot: Webhook notification
    Bot->>Agent: Initialize review (diff + metadata)
    loop Autonomous Investigation Loop
        Agent->>Cache: Check prompt cache
        Cache-->>Agent: Cached context (~90% hit rate)
        Agent->>Graph: Search codebase for context
        Graph-->>Agent: Related entities & dependencies
        Agent->>Git: Examine commit history
        Git-->>Agent: Historical context & rationale
        Agent->>Rules: Check team rules & learnings
        Rules-->>Agent: Applicable rules & patterns
        Agent->>Agent: Analyze findings, decide next step
        Note over Agent: Multi-hop reasoning:<br/>Each step informs the next.<br/>No rigid flowchart.
    end
    Agent->>Bot: Generate review output
    Bot->>SCM: Post PR summary
    Bot->>SCM: Post inline comments with suggestions
    Bot->>SCM: Post sequence diagrams (if applicable)
    Dev->>SCM: React 👍/👎 to comments
    SCM->>Rules: Update learned rules from feedback
    Dev->>SCM: Accept quick fix suggestion
    SCM->>SCM: Apply fix to PR
```

Investigation Capabilities

During each iteration of the loop, the agent can perform any combination of:

| Action | Description |
| --- | --- |
| Codebase Search | Semantic and keyword search across the entire indexed repository |
| Dependency Tracing | Follow function calls, imports, and usage sites to understand impact |
| Git History Analysis | Examine why code was written a certain way by reviewing commits and PRs |
| Pattern Comparison | Compare changed code against similar functions elsewhere |
| Rule Application | Apply team-specific rules, style guides, and learned conventions |
| Impact Assessment | Identify all locations affected by the change across the codebase |

Review Output

The agent produces several types of output:

- A pull request summary posted to the PR
- Inline comments with concrete suggestions
- Quick-fix suggestions the developer can accept directly
- Sequence diagrams, where applicable

Developer 👍/👎 reactions on these comments feed back into the learned rules.


Query & Search Flow

When a user asks a question about their codebase (via the web app, CLI, Slack, or API), Greptile employs a multi-modal search strategy. Rather than relying solely on vector similarity (which misses structural relationships) or keyword matching (which misses semantic intent), Greptile combines three search modalities and then applies an agentic layer to validate and enrich results.

The semantic search finds conceptually related code even when the terminology differs. The keyword search catches exact matches that vector search might rank lower. The agentic search adds a reasoning layer that traces references, follows call chains, and evaluates whether results are truly relevant to the query. This combination is what makes Greptile's RAG approach work for codebases, where simple document-style RAG falls short.
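A toy sketch of how the three modalities might be combined: union the semantic and keyword hits, then apply an agentic relevance filter. The scores and the reachability check below are illustrative stand-ins:

```python
# Toy sketch of combining the three search modalities: union the semantic
# and keyword hits, then let an "agentic" pass filter by traced relevance.
# Scores and the relevance check are illustrative stand-ins.

semantic_hits = {"auth.login": 0.91, "session.create": 0.78}   # vector similarity
keyword_hits = {"auth.login": 1.0, "auth.LOGIN_URL": 1.0}      # exact matches

def merge(semantic, keyword):
    # Union keeps exact matches that vector search might rank lower.
    merged = dict(semantic)
    for name, score in keyword.items():
        merged[name] = max(merged.get(name, 0.0), score)
    return merged

def agentic_filter(candidates, is_reachable):
    # The agentic layer traces references and drops results that are not
    # actually connected to the query's subject.
    return {n: s for n, s in candidates.items() if is_reachable(n)}

reachable = {"auth.login", "session.create"}  # pretend graph-traversal result
results = agentic_filter(merge(semantic_hits, keyword_hits), reachable.__contains__)
print(sorted(results))  # ['auth.login', 'session.create']
```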

Session IDs enable multi-turn conversations where context accumulates across questions, allowing developers to progressively drill deeper into a topic without repeating context.
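A hedged sketch of two such calls sharing a session ID: the request fields below are assumptions modeled on the v2 query endpoint and should be checked against the API reference before use:

```python
import json

# Hedged sketch of a multi-turn query: both requests carry the same session
# ID so context accumulates across questions. The field names (messages,
# repositories, sessionId) are assumptions, not verified API shapes.

API_URL = "https://api.greptile.com/v2/query"  # from the delivery layer above

def build_query(question: str, session_id: str) -> str:
    return json.dumps({
        "messages": [{"role": "user", "content": question}],
        "repositories": [{"remote": "github", "repository": "acme/webapp", "branch": "main"}],
        "sessionId": session_id,  # reuse across calls for multi-turn context
    })

first = build_query("How is authentication implemented?", "sess-123")
follow_up = build_query("Where is that token validated?", "sess-123")
assert json.loads(first)["sessionId"] == json.loads(follow_up)["sessionId"]
```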

```mermaid
sequenceDiagram
    participant User as Developer
    participant API as Greptile API
    participant Sem as Semantic Search
    participant KW as Keyword Search
    participant Ag as Agentic Search
    participant VDB as Vector Store
    participant CG as Code Graph
    participant LLM as LLM (Claude)
    User->>API: POST /query (natural language question)
    API->>API: Parse query & identify repositories
    par Multi-Modal Search
        API->>Sem: Semantic similarity search
        Sem->>VDB: Query vector embeddings
        VDB-->>Sem: Top-k similar docstrings
        Sem-->>API: Semantic results
        API->>KW: Keyword search
        KW->>CG: Text match against entities
        CG-->>KW: Matching entities
        KW-->>API: Keyword results
    end
    API->>Ag: Agentic search (validate & enrich)
    Ag->>CG: Trace references & connections
    CG-->>Ag: Related entities & call chains
    Ag->>Ag: Evaluate relevance of all results
    Ag-->>API: Curated & enriched results
    API->>LLM: Generate answer with context
    LLM-->>API: Natural language answer + references
    API-->>User: Answer + relevant files, functions, classes
```

Integration Architecture

Greptile's integration architecture is built around the Model Context Protocol (MCP), which provides a standardized interface for AI agents and external tools to interact with Greptile's codebase intelligence.

The MCP server operates in two modes: a traditional MCP mode for direct client integration (used by IDEs and coding agents) and an HTTP mode providing a JSON-RPC 2.0 interface for web applications and REST clients. This dual-mode design ensures compatibility with both the emerging MCP ecosystem and traditional integration patterns.

For inbound context, Greptile uses MCP to pull information from Jira, Notion, and Google Drive during reviews. This means the review agent can understand not just the code but also the ticket requirements, design documents, and team decisions that motivated the change. For outbound delivery, Greptile pushes results through GitHub/GitLab webhooks, Slack messages, and its REST API.

```mermaid
flowchart TB
    subgraph InboundContext["Inbound Context (MCP)"]
        Jira[Jira<br/>Ticket context]
        Notion[Notion<br/>Design docs]
        GDrive[Google Drive<br/>Specs & guides]
    end
    subgraph GreptileCore["Greptile Core"]
        MCPIn[MCP Client<br/>pulls external context]
        Engine[Intelligence Engine]
        MCPOut[MCP Server<br/>exposes tools]
        RESTAPI[REST API<br/>v2 endpoints]
    end
    subgraph OutboundDelivery["Outbound Delivery"]
        subgraph DirectIntegrations["Direct Integrations"]
            GHBot[GitHub PR Bot]
            GLBot[GitLab PR Bot]
            SlackBot[Slack Bot]
        end
        subgraph DeveloperInterfaces["Developer Interfaces"]
            WebApp[Web App]
            CLI[CLI Tool]
            IDE[IDE Agents<br/>via MCP]
        end
        subgraph CustomBuilt["Custom-Built (via API)"]
            SentryInt[Sentry Diagnosis]
            DatadogInt[Datadog Enrichment]
            DocGen[Doc Generators]
            CustomBots[Custom Bots]
        end
    end
    InboundContext -->|context during review| MCPIn
    MCPIn --> Engine
    Engine --> MCPOut
    Engine --> RESTAPI
    MCPOut --> IDE
    RESTAPI --> DirectIntegrations
    RESTAPI --> DeveloperInterfaces
    RESTAPI --> CustomBuilt
```

MCP Server Tools

The MCP server exposes four primary tools to AI agents:

| Tool | Description |
| --- | --- |
| index_repository | Submit a repository for indexing |
| query_repository | Ask natural language questions about indexed code |
| search_repository | Search for specific code entities |
| get_repository_info | Retrieve repository indexing status and metadata |
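In HTTP mode these tools would be invoked with JSON-RPC 2.0 requests. The envelope below follows the standard JSON-RPC 2.0 / MCP tools/call shape; the argument names passed to query_repository are hypothetical and should be checked against the server's tool schema:

```python
import json

# Sketch of invoking an MCP tool in HTTP (JSON-RPC 2.0) mode. The envelope
# follows the JSON-RPC 2.0 / MCP tools/call shape; the tool arguments are
# hypothetical placeholders.

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

body = mcp_tool_call(1, "query_repository", {
    "query": "Where are webhooks handled?",   # hypothetical argument name
    "repository": "acme/webapp",              # hypothetical argument name
})
print(json.loads(body)["params"]["name"])  # query_repository
```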

State Management

The state diagram below captures the lifecycle of a repository within Greptile and the transitions triggered by user actions, webhooks, and system events. Understanding these states is important because the availability of features (querying, reviewing) depends on the repository being in the Indexed state.

A repository begins as Unregistered — Greptile has no knowledge of it. When a user submits it via the API or dashboard, it transitions to Queued and then through the indexing pipeline stages. If indexing succeeds, the repository enters the Indexed state where all features become available. If it fails, the user can retry.

Once indexed, repositories are kept current through two mechanisms: webhook-triggered re-indexing when PRs are opened (ensuring the review agent has the latest context) and periodic scheduled re-indexing to catch direct pushes or other changes.

The Reviewing state is transient — it represents the period during which the autonomous agent is actively investigating a PR. Once the review is posted, the repository returns to the Indexed state.
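The allowed transitions can be captured in a small table that rejects illegal jumps; this is an illustration of the lifecycle described above, not Greptile's implementation:

```python
# Illustrative transition table for the repository lifecycle; guarding
# transitions in code catches invalid state jumps early.

TRANSITIONS = {
    "Unregistered": {"Queued"},
    "Queued": {"Cloning"},
    "Cloning": {"Parsing"},
    "Parsing": {"Generating"},
    "Generating": {"Embedding"},
    "Embedding": {"Mapping"},
    "Mapping": {"Persisting"},
    "Persisting": {"Indexed", "Failed"},
    "Failed": {"Queued"},
    "Indexed": {"Queued", "Reviewing", "Stale", "Querying"},
    "Reviewing": {"Investigating"},
    "Investigating": {"Investigating", "Posting"},
    "Posting": {"Indexed"},
    "Stale": {"Queued"},
    "Querying": {"Indexed"},
}

def transition(state: str, new_state: str) -> str:
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state

# Walk the transient review cycle: the repository ends back in Indexed.
state = "Indexed"
state = transition(state, "Reviewing")      # PR webhook received
state = transition(state, "Investigating")  # agent starts its loop
state = transition(state, "Posting")        # investigation complete
state = transition(state, "Indexed")        # review posted to the PR
print(state)  # Indexed
```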

```mermaid
stateDiagram-v2
    [*] --> Unregistered
    Unregistered --> Queued: POST /repositories
    state IndexingPipeline {
        Queued --> Cloning: Worker picks up job
        Cloning --> Parsing: Clone complete
        Parsing --> Generating: AST parsed
        Generating --> Embedding: Docstrings generated
        Embedding --> Mapping: Vectors computed
        Mapping --> Persisting: Relationships mapped
    }
    Persisting --> Indexed: Pipeline complete
    Persisting --> Failed: Error in pipeline
    Failed --> Queued: Retry submitted
    Indexed --> Queued: Re-index triggered
    Indexed --> Reviewing: PR webhook received
    state ReviewCycle {
        Reviewing --> Investigating: Agent starts loop
        Investigating --> Investigating: Multi-hop iteration
        Investigating --> Posting: Investigation complete
    }
    Posting --> Indexed: Review posted to PR
    Indexed --> Stale: Repository diverged
    Stale --> Queued: Scheduled re-index
    state QueryReady {
        Indexed --> Querying: User asks question
        Querying --> Indexed: Answer returned
    }
```

State Descriptions

| State | Description |
| --- | --- |
| Unregistered | Repository unknown to Greptile |
| Queued | Submitted for indexing, waiting for a worker |
| Cloning | Fetching repository content from source control |
| Parsing | Extracting code entities via AST analysis |
| Generating | Creating docstrings for each AST node |
| Embedding | Computing vector representations of docstrings |
| Mapping | Building the relationship graph between entities |
| Persisting | Writing graph and vectors to storage |
| Indexed | Fully indexed and ready for queries and reviews |
| Failed | Indexing pipeline encountered an error |
| Reviewing | PR review agent is actively investigating |
| Investigating | Agent is in its autonomous multi-hop loop |
| Posting | Agent has completed analysis and is posting results |
| Stale | Repository has diverged from the indexed version |
| Querying | A user query is being processed |

Security & Data Flow

Greptile's security architecture is designed around the principle of minimal data retention. The cloud product does not store customer code on its servers. Instead, it stores only the derived artifacts: the knowledge graph, vector embeddings, and generated docstrings. When actual code snippets are needed during a review or query, they are pulled on-demand from the source control API using the customer's access tokens.

For enterprise deployments, Greptile offers a fully self-hosted option where the entire system runs within the customer's AWS infrastructure. This includes the ability to bring your own LLM provider, eliminating any external data flow. The self-hosted architecture achieves cost efficiency through prompt caching, with cache hit rates near 90% translating to 75% lower inference costs despite using 3x more context tokens than the previous version.

SOC 2 Type II compliance, annual external audits, and penetration testing provide the governance framework. SSO/SAML integration, custom DPAs, and GitHub Enterprise support round out the enterprise security story.

```mermaid
flowchart LR
    subgraph CustomerInfra["Customer Infrastructure"]
        Dev[Developer]
        SCM[GitHub / GitLab]
    end
    subgraph GreptileCloud["Greptile Cloud"]
        API[API Gateway]
        Agent[Review Agent]
        GraphDB[(Code Graph)]
        VectorDB[(Vector Store)]
        CacheLayer[(Prompt Cache)]
    end
    subgraph SelfHosted["Self-Hosted Option"]
        SHAgent[Review Agent]
        SHLLM[Customer's LLM]
        SHGraph[(Code Graph)]
        SHVector[(Vector Store)]
    end
    Dev -->|"API key + GitHub PAT"| API
    SCM -->|"Webhooks"| API
    API -->|"Pull snippets on demand<br/>(no code stored)"| SCM
    API --> Agent
    Agent --> GraphDB
    Agent --> VectorDB
    Agent --> CacheLayer
    Dev -.->|"Enterprise: self-hosted"| SHAgent
    SHAgent --> SHLLM
    SHAgent --> SHGraph
    SHAgent --> SHVector
```

Summary

Greptile's architecture reflects a core insight: codebases are graphs, not documents. By investing heavily in the indexing pipeline to build a comprehensive knowledge graph, Greptile enables a fundamentally different kind of code review — one where the AI agent can autonomously investigate, trace dependencies, examine history, and apply team-specific rules with full codebase context.

The v3 architecture, built on the Anthropic Claude Agent SDK, represents a shift from rigid orchestration to autonomous reasoning. Each investigation step generates new information that informs the next step, enabling the kind of multi-hop analysis that catches bugs hidden in the interactions between distant parts of a codebase. Combined with near-90% prompt cache hit rates and an MCP-based integration layer, Greptile delivers deep codebase intelligence through a growing ecosystem of developer touchpoints.