Reference
Lookup tables for paths, environment variables, and defaults.
#Paths
| Path | Purpose |
|---|
~/.config/gitcrawl/config.toml | Configuration file |
~/.config/gitcrawl/gitcrawl.db | SQLite database |
~/.config/gitcrawl/cache/ | Caches (PR detail, gh-shim fallthrough) |
~/.config/gitcrawl/cache/gh-shim/ | gh-shim fallthrough cache |
~/.config/gitcrawl/vectors/ | Vector store backing embeddings |
~/.config/gitcrawl/logs/ | Operational logs |
~/.config/gitcrawl/portable/ | Portable-store checkout (when configured) |
Override the config root with --config <path> or GITCRAWL_CONFIG.
#Environment variables
#Core
| Variable | Default | Used by | Purpose |
|---|
GITCRAWL_CONFIG | ~/.config/gitcrawl/config.toml | All commands | Override config path |
GITCRAWL_DB_PATH | ~/.config/gitcrawl/gitcrawl.db | All commands | Override database path |
GITHUB_TOKEN | (none) | sync, gh shim | GitHub API token |
OPENAI_API_KEY | (none) | embed, refresh | OpenAI API key |
#Models
| Variable | Default | Purpose |
|---|
GITCRAWL_SUMMARY_MODEL | gpt-5.4 | Summary model (reserved for future commands) |
GITCRAWL_EMBED_MODEL | text-embedding-3-small | OpenAI embedding model |
GITCRAWL_OPENAI_RETRY_DISABLED | (off) | Set 1 to disable OpenAI retry/backoff |
GITCRAWL_OPENAI_BASE_URL / OPENAI_BASE_URL | OpenAI default | Custom OpenAI endpoint |
#GitHub overrides
| Variable | Default | Purpose |
|---|
GITCRAWL_GITHUB_BASE_URL / GITHUB_BASE_URL | GitHub default | Custom GitHub API endpoint |
GH_HOST | (none) | Included in gh-shim cache key |
GH_REPO | (none) | Default -R value; included in gh-shim cache key |
#gh shim
| Variable | Default | Purpose |
|---|
GITCRAWL_GH_PATH | (probed) | Path to the real gh binary |
GITCRAWL_GH_AUTO_HYDRATE | (on) | Set 0 to disable PR auto-hydration on cache miss |
GITCRAWL_GH_CACHE_TTL | 30s for most commands | Override fallthrough cache TTL (e.g., 5m, 1h) |
GITCRAWL_GH_CACHE_ERRORS | (on) | Set 0 to avoid caching non-zero read-only fallthroughs |
#Configuration defaults
| Field | Default |
|---|
summary_model | gpt-5.4 |
embed_model | text-embedding-3-small |
embed_dimensions | 1024 |
embedding_basis | title_original |
batch_size (embeddings) | 64 |
concurrency (embeddings) | 2 |
tui_default_sort | size |
#Clustering defaults
| Parameter | Default | Source |
|---|
--threshold | 0.80 | cluster, refresh |
--cross-kind-threshold | 0.93 | cluster, refresh |
--min-size | 1 | cluster, refresh |
--max-cluster-size | 40 | cluster, refresh |
--k (nearest-neighbor fanout) | 16 | cluster, refresh |
| Weak-edge title overlap floor | 0.18 | internal |
| High-confidence edge score | 0.90 | internal |
| Deterministic reference edge score | 0.94 | internal |
| Body-only reference prefix length | 240 chars | internal |
#TUI defaults
| Parameter | Default |
|---|
--min-size | 5 |
--sort | size |
| Working set limit | 500 rows |
| Refresh interval | 15s |
#gh shim cache TTLs
| Cache class | TTL |
|---|
| Most read-only fallthroughs | 30s |
gh api (GET) | 60s |
gh pr diff without stable head SHA | 5m |
gh pr diff with stable head SHA | 7d |
| Override | GITCRAWL_GH_CACHE_TTL |
| Cache read failures | on by default; disable with GITCRAWL_GH_CACHE_ERRORS=0 |
#gh shim cache key composition
A SHA-256 hash of:
- Version tag (
v2)
- Resolved gitcrawl config path
- Current working directory
GH_HOST env var
GH_REPO env var
- For
gh pr diff: pr-diff:owner/repo:number:head-sha (when head SHA is known)
- Full command argument vector (null-separated)
This isolates sibling checkouts and portable stores while coalescing repeated calls from the same workspace.
| Format | Where to use |
|---|
text | Human terminal use (default) |
json | Pipelines, scripts, agents (also via --json) |
log | Internal structured logging output |
#Exit codes
0 — success
- non-zero — usage error, "not implemented" command, or runtime failure
stderr always carries error messages. stdout is reserved for command output.
#File-system layout (worked example)
~/.config/gitcrawl/
├── config.toml
├── gitcrawl.db # SQLite mirror
├── gitcrawl.db-shm # SQLite shared-memory file
├── gitcrawl.db-wal # SQLite write-ahead log
├── cache/
│ ├── gh-shim/ # gh fallthrough cache; inspect with xcache
│ └── pr/ # hydrated PR detail blobs
├── vectors/ # vector store backing embeddings
├── logs/
└── portable/ # portable-store checkout (optional)
└── data/
└── owner__repo.sync.db
#See also