# LLM Integration
Drift uses Ollama for fully local LLM inference. No API keys, no cloud — your queries never leave your machine.
## How It Works
```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Ollama
    participant Safety
    User->>CLI: drift suggest "find large files"
    CLI->>CLI: Gather context (cwd, git, user)
    CLI->>Ollama: POST /api/generate (JSON mode)
    Ollama->>Ollama: Local inference
    Ollama->>CLI: JSON response
    CLI->>CLI: Pydantic validation
    CLI->>Safety: Validate commands
    Safety->>CLI: Risk assessment
    CLI->>User: Display plan with risk badge
```
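The round trip to Ollama is a single HTTP call to its generate endpoint in JSON mode. The sketch below is a minimal illustration of that step, assuming the `requests` library and Ollama's default local port; the function and parameter names are illustrative, not Drift's actual internals.

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def query_ollama(query: str, system_prompt: str, model: str = "qwen2.5-coder:1.5b") -> dict:
    """Send a prompt to the local Ollama server and parse the JSON plan it returns."""
    payload = {
        "model": model,
        "system": system_prompt,  # the structured system prompt described below
        "prompt": query,          # the sanitized user query plus gathered context
        "format": "json",         # ask Ollama to constrain output to valid JSON
        "stream": False,          # return one complete response instead of a token stream
    }
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    # Ollama wraps the model's output in a "response" field; the plan JSON lives inside it.
    return json.loads(resp.json()["response"])
```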
## System Prompt
Drift sends a structured system prompt that forces JSON output:
```text
Role: You are Drift, a terminal assistant.
Task: Convert natural language to shell commands.
Format: Strict JSON matching the Plan schema.
Rules:
- Assess risk accurately
- Use clarification_needed for ambiguous queries
- Provide dry-run versions when possible
- Never suggest dangerous root operations
```
## JSON Schema
The LLM must respond with JSON matching this schema:
```json
{
  "summary": "Brief description of the plan",
  "risk": "low | medium | high",
  "commands": [
    {
      "command": "the shell command",
      "description": "what this command does",
      "dry_run": "optional safe preview version"
    }
  ],
  "explanation": "Detailed explanation",
  "affected_files": ["list", "of", "files"],
  "clarification_needed": [
    {
      "question": "What directory?",
      "options": ["/home", "/tmp"]
    }
  ]
}
```
Responses are validated with Pydantic. If the JSON is malformed or missing required fields, Drift retries with exponential backoff.
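In Pydantic terms, the schema corresponds to a small set of models along the lines of the sketch below; the class names and defaults are illustrative, not Drift's actual definitions.

```python
from typing import Optional
from pydantic import BaseModel


class Command(BaseModel):
    command: str                   # the shell command to run
    description: str               # what this command does
    dry_run: Optional[str] = None  # optional safe preview version


class Clarification(BaseModel):
    question: str
    options: list[str] = []


class Plan(BaseModel):
    summary: str
    risk: str                      # "low", "medium", or "high"
    commands: list[Command]
    explanation: str
    affected_files: list[str] = []
    clarification_needed: list[Clarification] = []


# Parsing a raw LLM response; a ValidationError here is what triggers a retry.
# plan = Plan.model_validate_json(raw_response)
```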
## Supported Models
Any Ollama model works. Recommended:
| Model | Size | Quality | Speed |
|---|---|---|---|
| `qwen2.5-coder:1.5b` | ~1GB | Good | Fast |
| `qwen2.5-coder:7b` | ~4.5GB | Better | Medium |
| `codellama:7b` | ~4GB | Good | Medium |
| `deepseek-coder:6.7b` | ~4GB | Good | Medium |
## Context Awareness
Before querying the LLM, Drift gathers context:
- Current directory and its contents
- Git status (branch, modified files, remote URL)
- User info (username, shell)
- Project type (detected from files like `package.json`, `pyproject.toml`)
- Memory (learned preferences from past interactions)
This context is injected into the prompt so the LLM generates commands appropriate for your environment.
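A simplified version of that gathering step might look like the sketch below. The function name, the marker-file table, and the exact fields are illustrative; Drift's real collector also folds in memory and richer git details.

```python
import getpass
import os
import subprocess
from pathlib import Path

# Marker files used to guess the project type; an illustrative subset.
PROJECT_MARKERS = {"package.json": "node", "pyproject.toml": "python", "Cargo.toml": "rust"}


def gather_context() -> dict:
    """Collect environment details that get injected into the LLM prompt."""
    cwd = Path.cwd()
    context = {
        "cwd": str(cwd),
        "entries": sorted(p.name for p in cwd.iterdir())[:50],  # cap to keep the prompt small
        "user": getpass.getuser(),
        "shell": os.environ.get("SHELL", ""),
        "project_type": next(
            (kind for marker, kind in PROJECT_MARKERS.items() if (cwd / marker).exists()),
            "unknown",
        ),
    }
    try:
        branch = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        context["git_branch"] = branch
    except (subprocess.CalledProcessError, FileNotFoundError):
        context["git_branch"] = None  # not a git repo, or git not installed
    return context
```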
## Input Sanitization
User queries are sanitized before being sent to the LLM:
- Strip control characters
- Limit query length
- Escape potential prompt injection attempts
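The sanitizer can be as simple as the sketch below; the length limit and the exact injection-escaping rules are illustrative assumptions, not Drift's actual implementation.

```python
import re

MAX_QUERY_LENGTH = 500  # illustrative limit; the real cap may differ


def sanitize_query(query: str) -> str:
    """Clean a user query before it is embedded in the LLM prompt."""
    # Strip control characters (keep tabs, newlines, and printable text).
    query = re.sub(r"[\x00-\x08\x0b-\x1f\x7f]", "", query)
    # Limit query length so a single query cannot flood the prompt.
    query = query[:MAX_QUERY_LENGTH]
    # Neutralize common prompt-injection framing such as fake role markers.
    query = re.sub(r"(?im)^\s*(system|assistant)\s*:", r"[\1]:", query)
    return query.strip()
```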
## Error Handling
| Scenario | Behavior |
|---|---|
| Ollama not running | Shows error, suggests `ollama serve` |
| Model not pulled | Shows error, suggests `ollama pull` |
| Invalid JSON response | Retries up to 3 times |
| Network timeout | Retries with exponential backoff |
| Ambiguous query | LLM returns `clarification_needed` |
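Taken together, the retry behavior amounts to a small backoff loop around the Ollama call, roughly like the sketch below. It reuses the hypothetical `query_ollama` and `Plan` names from the earlier sketches, and the delays and attempt count are illustrative.

```python
import time

import requests
from pydantic import ValidationError


def suggest_with_retries(query: str, system_prompt: str, max_attempts: int = 3) -> "Plan":
    """Call Ollama and validate the response, retrying transient failures with backoff."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        try:
            raw = query_ollama(query, system_prompt)  # HTTP call from the earlier sketch
            return Plan.model_validate(raw)           # schema violations raise ValidationError
        except requests.ConnectionError:
            raise RuntimeError("Ollama is not running. Start it with 'ollama serve'.")
        except (requests.Timeout, ValidationError, ValueError):
            if attempt == max_attempts:
                raise
            time.sleep(delay)  # exponential backoff: 1s, 2s, 4s, ...
            delay *= 2
```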