# LLM Integration
Drift uses Ollama for fully local LLM inference. No API keys, no cloud — your queries never leave your machine.
## How It Works
```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Ollama
    participant Safety
    User->>CLI: drift suggest "find large files"
    CLI->>CLI: Gather context (cwd, git, user)
    CLI->>Ollama: POST /api/generate (JSON mode)
    Ollama->>Ollama: Local inference
    Ollama->>CLI: JSON response
    CLI->>CLI: Pydantic validation
    CLI->>Safety: Validate commands
    Safety->>CLI: Risk assessment
    CLI->>User: Display plan with risk badge
```
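The round trip to Ollama is a single HTTP call to its generate endpoint in JSON mode. The sketch below is a minimal illustration of that step, assuming the `requests` library and Ollama's default local port; the function and parameter names are illustrative, not Drift's actual internals.

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def query_ollama(query: str, system_prompt: str, model: str = "qwen2.5-coder:1.5b") -> dict:
    """Send a prompt to the local Ollama server and parse the JSON plan it returns."""
    payload = {
        "model": model,
        "system": system_prompt,  # the structured system prompt described below
        "prompt": query,          # the sanitized user query plus gathered context
        "format": "json",         # ask Ollama to constrain output to valid JSON
        "stream": False,          # return one complete response instead of a token stream
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    # Ollama wraps the model's output in a "response" field; the plan JSON lives inside it.
    return json.loads(resp.json()["response"])
```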
## System Prompt
Drift sends a structured system prompt that forces JSON output:
```text
Role: You are Drift, a terminal assistant.
Task: Convert natural language to shell commands.
Format: Strict JSON matching the Plan schema.
Rules:
- Assess risk accurately
- Use clarification_needed for ambiguous queries
- Provide dry-run versions when possible
- Never suggest dangerous root operations
```
## JSON Schema
The LLM must respond with JSON matching this schema:
```json
{
  "summary": "Brief description of the plan",
  "risk": "low | medium | high",
  "commands": [
    {
      "command": "the shell command",
      "description": "what this command does",
      "dry_run": "optional safe preview version"
    }
  ],
  "explanation": "Detailed explanation",
  "affected_files": ["list", "of", "files"],
  "clarification_needed": [
    {
      "question": "What directory?",
      "options": ["/home", "/tmp"]
    }
  ]
}
```
Responses are validated with Pydantic. If the JSON is malformed or missing required fields, Drift retries with exponential backoff.
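In Pydantic terms, the schema corresponds to a small set of models along the lines of the sketch below; the class names and defaults are illustrative, not Drift's actual definitions.

```python
from typing import Optional
from pydantic import BaseModel


class Command(BaseModel):
    command: str                   # the shell command to run
    description: str               # what this command does
    dry_run: Optional[str] = None  # optional safe preview version


class Clarification(BaseModel):
    question: str
    options: list[str] = []


class Plan(BaseModel):
    summary: str
    risk: str                      # "low", "medium", or "high"
    commands: list[Command]
    explanation: str
    affected_files: list[str] = []
    clarification_needed: list[Clarification] = []


# Parsing a raw LLM response; a ValidationError here is what triggers a retry.
# plan = Plan.model_validate_json(raw_response)
```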
## Supported Models
Any Ollama model works. Recommended:
| Model | Size | Quality | Speed |
|---|---|---|---|
| `qwen2.5-coder:1.5b` | ~1GB | Good | Fast |
| `qwen2.5-coder:7b` | ~4.5GB | Better | Medium |
| `codellama:7b` | ~4GB | Good | Medium |
| `deepseek-coder:6.7b` | ~4GB | Good | Medium |
## Context Awareness
Before querying the LLM, Drift gathers context:
- Current directory and its contents
- Git status (branch, modified files, remote URL)
- User info (username, shell)
- Project type (detected from files like `package.json`, `pyproject.toml`)
- Memory (learned preferences from past interactions)
This context is injected into the prompt so the LLM generates commands appropriate for your environment.
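A simplified version of that gathering step might look like the sketch below. The function name, the marker-file table, and the exact fields are illustrative; Drift's real collector also folds in memory and richer git details.

```python
import getpass
import os
import subprocess
from pathlib import Path

# Marker files used to guess the project type; an illustrative subset.
PROJECT_MARKERS = {"package.json": "node", "pyproject.toml": "python", "Cargo.toml": "rust"}


def gather_context() -> dict:
    """Collect environment details that get injected into the LLM prompt."""
    cwd = Path.cwd()
    context = {
        "cwd": str(cwd),
        "entries": sorted(p.name for p in cwd.iterdir())[:50],  # cap to keep the prompt small
        "user": getpass.getuser(),
        "shell": os.environ.get("SHELL", ""),
        "project_type": next(
            (kind for marker, kind in PROJECT_MARKERS.items() if (cwd / marker).exists()),
            "unknown",
        ),
    }
    try:
        branch = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        context["git_branch"] = branch
    except (subprocess.CalledProcessError, FileNotFoundError):
        context["git_branch"] = None  # not a git repo, or git not installed
    return context
```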
## Input Sanitization
User queries are sanitized before being sent to the LLM:
- Strip control characters
- Limit query length
- Escape potential prompt injection attempts
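The sanitizer can be as simple as the sketch below; the length limit and the exact injection-escaping rules are illustrative assumptions, not Drift's actual implementation.

```python
import re

MAX_QUERY_LENGTH = 500  # illustrative limit; the real cap may differ


def sanitize_query(query: str) -> str:
    """Clean a user query before it is embedded in the LLM prompt."""
    # Strip control characters (keep tabs, newlines, and printable text).
    query = re.sub(r"[\x00-\x08\x0b-\x1f\x7f]", "", query)
    # Limit query length so a single query cannot flood the prompt.
    query = query[:MAX_QUERY_LENGTH]
    # Neutralize common prompt-injection framing such as fake role markers.
    query = re.sub(r"(?im)^\s*(system|assistant)\s*:", r"[\1]:", query)
    return query.strip()
```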
## Error Handling
| Scenario | Behavior |
|---|---|
| Ollama not running | Shows error, suggests `ollama serve` |
| Model not pulled | Shows error, suggests `ollama pull` |
| Invalid JSON response | Retries up to 3 times |
| Network timeout | Retries with exponential backoff |
| Ambiguous query | LLM returns `clarification_needed` |
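Taken together, the retry behavior amounts to a small backoff loop around the Ollama call, roughly like the sketch below. It reuses the hypothetical `query_ollama` and `Plan` names from the earlier sketches, and the delays and attempt count are illustrative.

```python
import time

import requests
from pydantic import ValidationError


def suggest_with_retries(query: str, system_prompt: str, max_attempts: int = 3) -> "Plan":
    """Call Ollama and validate the response, retrying transient failures with backoff."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            raw = query_ollama(query, system_prompt)  # HTTP call from the earlier sketch
            return Plan.model_validate(raw)           # schema violations raise ValidationError
        except requests.ConnectionError:
            raise RuntimeError("Ollama is not running. Start it with 'ollama serve'.")
        except (requests.Timeout, ValidationError, ValueError):
            if attempt == max_attempts:
                raise
            time.sleep(delay)  # exponential backoff: 1s, 2s, 4s, ...
            delay *= 2
```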