Reference
Lookup tables for paths, environment variables, and defaults.
# Paths
| Platform | Path | Purpose |
| --- | --- | --- |
| Linux | `${XDG_CONFIG_HOME:-~/.config}/gitcrawl/config.toml` | Configuration file |
| Linux | `${XDG_DATA_HOME:-~/.local/share}/gitcrawl/gitcrawl.db` | SQLite database |
| Linux | `${XDG_CACHE_HOME:-~/.cache}/gitcrawl/` | Local runtime caches |
| Linux | `${XDG_DATA_HOME:-~/.local/share}/gitcrawl/vectors/` | Vector store backing embeddings |
| Linux | `${XDG_STATE_HOME:-~/.local/state}/gitcrawl/logs/` | Operational logs |
| macOS | `~/Library/Application Support/gitcrawl/config.toml` | Configuration file |
| macOS | `~/Library/Application Support/gitcrawl/gitcrawl.db` | SQLite database |
| macOS | `~/Library/Caches/gitcrawl/` | Local runtime caches |
| macOS | `~/Library/Application Support/gitcrawl/vectors/` | Vector store backing embeddings |
| macOS | `~/Library/Application Support/gitcrawl/logs/` | Operational logs |
| Legacy installs | `~/.config/gitcrawl/` | Preserved config, database, cache, vectors, logs, and stores until the corresponding new path exists |
Existing installs with ~/.config/gitcrawl/config.toml continue to load that config when the new platform config path does not exist. Override the config path with --config <path> or GITCRAWL_CONFIG.
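The lookup order described above can be sketched in a few lines of Python. This is an illustration only, not Gitcrawl's actual code: the function names are hypothetical, and the assumption that `--config` takes precedence over `GITCRAWL_CONFIG` is not stated in this reference.

```python
from pathlib import Path
from typing import Mapping, Optional


def platform_config_path(env: Mapping[str, str], home: Path, platform: str) -> Path:
    """Platform default config path, following the Paths table."""
    if platform == "darwin":
        return home / "Library" / "Application Support" / "gitcrawl" / "config.toml"
    # Linux: ${XDG_CONFIG_HOME:-~/.config}/gitcrawl/config.toml
    # (an unset or empty variable falls back to ~/.config, like the shell's ":-").
    xdg = env.get("XDG_CONFIG_HOME")
    base = Path(xdg) if xdg else home / ".config"
    return base / "gitcrawl" / "config.toml"


def resolve_config_path(
    flag: Optional[str], env: Mapping[str, str], home: Path, platform: str
) -> Path:
    """Pick the config file to load (hypothetical helper)."""
    # Assumption: an explicit --config value wins over GITCRAWL_CONFIG.
    if flag:
        return Path(flag)
    if env.get("GITCRAWL_CONFIG"):
        return Path(env["GITCRAWL_CONFIG"])
    default = platform_config_path(env, home, platform)
    legacy = home / ".config" / "gitcrawl" / "config.toml"
    # Legacy installs keep loading the old file until the new path exists.
    if not default.exists() and legacy.exists():
        return legacy
    return default
```

In a real process this would be called as `resolve_config_path(None, os.environ, Path.home(), sys.platform)`. Passing the environment, home directory, and platform as arguments keeps the sketch easy to test.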
# Environment variables
# Core
| Variable | Default | Used by | Purpose |
| --- | --- | --- | --- |
| `GITCRAWL_CONFIG` | Platform default config path | All commands | Override config path |
| `GITCRAWL_DB_PATH` | Platform default database path | All commands | Override database path |
| `GITCRAWL_TUI_LAYOUT` | `columns` | `tui` | Override default wide-screen layout |
| `GITHUB_TOKEN` | (none) | `sync` | GitHub API token |
| `OPENAI_API_KEY` | (none) | `embed`, `refresh` | OpenAI API key |
# Models
| Variable | Default | Purpose |
| --- | --- | --- |
| `GITCRAWL_SUMMARY_MODEL` | `gpt-5.4` | Summary model (reserved for future commands) |
| `GITCRAWL_EMBED_MODEL` | `text-embedding-3-small` | OpenAI embedding model |
| `GITCRAWL_OPENAI_RETRY_DISABLED` | (off) | Set to `1` to disable OpenAI retry/backoff |
| `GITCRAWL_OPENAI_BASE_URL` / `OPENAI_BASE_URL` | OpenAI default | Custom OpenAI endpoint |
# GitHub overrides
| Variable | Default | Purpose |
| --- | --- | --- |
| `GITCRAWL_GITHUB_BASE_URL` / `GITHUB_BASE_URL` | GitHub default | Custom GitHub API endpoint |
| `GH_REPO` | (none) | Default repository for compatible local search shapes |
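The base-URL overrides for OpenAI and GitHub each accept two variable names. This reference does not say which name wins when both are set. The sketch below assumes the `GITCRAWL_`-prefixed name takes priority and that an empty value counts as unset; both points are assumptions, and the helper names are hypothetical.

```python
from typing import Mapping, Optional


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the first non-empty variable among names, or None."""
    for name in names:
        value = env.get(name)
        if value:  # assumption: an empty string is treated as unset
            return value
    return None


def github_base_url(env: Mapping[str, str]) -> Optional[str]:
    # Assumption: the GITCRAWL_-prefixed variable is consulted first.
    return first_set(env, "GITCRAWL_GITHUB_BASE_URL", "GITHUB_BASE_URL")


def openai_base_url(env: Mapping[str, str]) -> Optional[str]:
    return first_set(env, "GITCRAWL_OPENAI_BASE_URL", "OPENAI_BASE_URL")
```

A return value of `None` stands for "use the service default" listed in the tables.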
# gh shim
gitcrawl gh moved to Octopool. Run octopool login, then use octopool gh ....
# Configuration defaults
| Field | Default |
| --- | --- |
| `summary_model` | `gpt-5.4` |
| `embed_model` | `text-embedding-3-small` |
| `embed_dimensions` | `1024` |
| `embedding_basis` | `title_original` |
| `vector_backend` | `exact`; `turbovec` requires Python `turbovec` and dimensions divisible by 8 |
| `batch_size` (embeddings) | `64` |
| `concurrency` (embeddings) | `2` |
| `tui_default_sort` | `size` |
| `tui_default_layout` | `columns` |
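The `turbovec` constraints can be checked before switching `vector_backend`. The following is a minimal sketch, not part of Gitcrawl: the function name is hypothetical, and it assumes the Python package is importable under the module name `turbovec`.

```python
import importlib.util
from typing import List


def vector_backend_problems(backend: str, dimensions: int) -> List[str]:
    """List reasons a vector_backend / embed_dimensions pair would not work."""
    problems: List[str] = []
    if backend == "exact":
        return problems  # the default backend has no extra requirements listed
    if backend == "turbovec":
        if dimensions % 8 != 0:
            problems.append(f"embed_dimensions={dimensions} is not divisible by 8")
        # Assumption: the package's import name is also "turbovec".
        if importlib.util.find_spec("turbovec") is None:
            problems.append("Python package 'turbovec' is not importable")
        return problems
    problems.append(f"vector_backend {backend!r} is not listed in this reference")
    return problems
```

The default `embed_dimensions` of 1024 satisfies the divisibility rule, so only a custom dimension count can trip the first check.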
# Clustering defaults
| Parameter | Default | Source |
| --- | --- | --- |
| `--threshold` | `0.80` | `cluster`, `refresh` |
| `--cross-kind-threshold` | `0.93` | `cluster`, `refresh` |
| `--min-size` | `1` | `cluster`, `refresh` |
| `--max-cluster-size` | `40` | `cluster`, `refresh` |
| `--k` (nearest-neighbor fanout) | `16` | `cluster`, `refresh` |
| Weak-edge title overlap floor | `0.18` | internal |
| High-confidence edge score | `0.90` | internal |
| Deterministic reference edge score | `0.94` | internal |
| Body-only reference prefix length | 240 chars | internal |
# TUI defaults
| Parameter | Default |
| --- | --- |
| `--min-size` | `5` |
| `--sort` | `size` |
| `--layout` | `columns` |
| Working set limit | 500 rows |
| Refresh interval | 15s |
# Output formats
| Format | Where to use |
| --- | --- |
| `text` | Human terminal use (default) |
| `json` | Pipelines, scripts, agents (also via `--json`) |
| `log` | Internal structured logging output |
# Exit codes
0 — success
non-zero — usage error, "not implemented" command, or runtime failure
stderr always carries error messages. stdout is reserved for command output.
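A script can lean on this contract: treat any non-zero exit as failure, read the error text from stderr, and parse stdout only on success. The wrapper below is a hedged sketch. `run_json` and `CommandFailed` are hypothetical names, and it assumes the command was asked for JSON output, for example with `--json`.

```python
import json
import subprocess
from typing import Any, Sequence


class CommandFailed(RuntimeError):
    """Raised when the child process exits with a non-zero status."""


def run_json(argv: Sequence[str]) -> Any:
    """Run a command and return its stdout parsed as JSON."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    if proc.returncode != 0:
        # Non-zero covers usage errors, "not implemented" commands, and
        # runtime failures; the message is on stderr.
        raise CommandFailed(f"exit {proc.returncode}: {proc.stderr.strip()}")
    # stdout is reserved for command output, so it is safe to parse whole.
    return json.loads(proc.stdout)
```

The argument list would start with `gitcrawl`, followed by whichever subcommand you need and `--json`.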
# File-System Layout
Linux:
~/.config/gitcrawl/
├── config.toml
└── stores/
    └── gitcrawl-store/ # portable-store checkout (optional)
        └── data/
            └── owner__repo.sync.db
~/.local/share/gitcrawl/
├── gitcrawl.db # SQLite mirror
├── gitcrawl.db-shm # SQLite shared-memory file
├── gitcrawl.db-wal # SQLite write-ahead log
└── vectors/ # vector store backing embeddings
~/.cache/gitcrawl/ # local runtime cache
macOS:
~/Library/Application Support/gitcrawl/
├── config.toml
├── gitcrawl.db # SQLite mirror
├── gitcrawl.db-shm # SQLite shared-memory file
├── gitcrawl.db-wal # SQLite write-ahead log
├── vectors/ # vector store backing embeddings
├── logs/
└── stores/
    └── gitcrawl-store/ # portable-store checkout (optional)
        └── data/
            └── owner__repo.sync.db
~/Library/Caches/gitcrawl/ # local runtime cache
Older docs showed cache/pr; current Gitcrawl stores PR details in SQLite instead of a cache/pr directory.
# See also
Commands reference