Reference

Lookup tables for paths, environment variables, and defaults.

#Paths

| Platform | Path | Purpose |
| --- | --- | --- |
| Linux | ${XDG_CONFIG_HOME:-~/.config}/gitcrawl/config.toml | Configuration file |
| Linux | ${XDG_DATA_HOME:-~/.local/share}/gitcrawl/gitcrawl.db | SQLite database |
| Linux | ${XDG_CACHE_HOME:-~/.cache}/gitcrawl/ | Local runtime caches |
| Linux | ${XDG_DATA_HOME:-~/.local/share}/gitcrawl/vectors/ | Vector store backing embeddings |
| Linux | ${XDG_STATE_HOME:-~/.local/state}/gitcrawl/logs/ | Operational logs |
| macOS | ~/Library/Application Support/gitcrawl/config.toml | Configuration file |
| macOS | ~/Library/Application Support/gitcrawl/gitcrawl.db | SQLite database |
| macOS | ~/Library/Caches/gitcrawl/ | Local runtime caches |
| macOS | ~/Library/Application Support/gitcrawl/vectors/ | Vector store backing embeddings |
| macOS | ~/Library/Application Support/gitcrawl/logs/ | Operational logs |
| Legacy installs | ~/.config/gitcrawl/ | Preserved config, database, cache, vectors, logs, and stores until the corresponding new path exists |
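
The `${VAR:-fallback}` notation in the Linux rows is shell-style default expansion: the fallback applies when the variable is unset or empty. A minimal sketch of that expansion follows; the helper names `xdg_base` and `linux_paths` are hypothetical and are not part of gitcrawl.

```python
from pathlib import Path
from typing import Mapping


def xdg_base(env: Mapping[str, str], var: str, fallback: str) -> Path:
    """Shell-style ${VAR:-fallback}: use the fallback when VAR is unset or empty."""
    return Path(env.get(var) or fallback).expanduser()


def linux_paths(env: Mapping[str, str]) -> dict:
    """Default Linux locations, following the rows of the table above."""
    data = xdg_base(env, "XDG_DATA_HOME", "~/.local/share") / "gitcrawl"
    return {
        "config": xdg_base(env, "XDG_CONFIG_HOME", "~/.config") / "gitcrawl" / "config.toml",
        "database": data / "gitcrawl.db",
        "vectors": data / "vectors",
        "cache": xdg_base(env, "XDG_CACHE_HOME", "~/.cache") / "gitcrawl",
        "logs": xdg_base(env, "XDG_STATE_HOME", "~/.local/state") / "gitcrawl" / "logs",
    }
```

Calling `linux_paths(os.environ)` would give the paths for the current shell environment.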

Existing installs with ~/.config/gitcrawl/config.toml continue to load that config when the new platform config path does not exist. Override the config path with --config <path> or GITCRAWL_CONFIG.
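
The lookup order described above can be sketched roughly as follows. This is an illustration, not gitcrawl's actual code: the function name `resolve_config_path` is hypothetical, and the ordering of `--config` ahead of `GITCRAWL_CONFIG` is an assumption, since this page does not say which override wins when both are set.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(
    flag: Optional[str],          # value of --config, if given
    env: Mapping[str, str],       # environment, e.g. os.environ
    platform_default: Path,       # new platform config path
    legacy: Path,                 # ~/.config/gitcrawl/config.toml
) -> Path:
    # Assumption: an explicit flag beats the environment variable.
    if flag:
        return Path(flag).expanduser()
    from_env = env.get("GITCRAWL_CONFIG")
    if from_env:
        return Path(from_env).expanduser()
    # Legacy config is used only while the new platform path does not exist.
    if not platform_default.exists() and legacy.exists():
        return legacy
    return platform_default
```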

#Environment variables

#Core

| Variable | Default | Used by | Purpose |
| --- | --- | --- | --- |
| GITCRAWL_CONFIG | Platform default config path | All commands | Override config path |
| GITCRAWL_DB_PATH | Platform default database path | All commands | Override database path |
| GITCRAWL_TUI_LAYOUT | columns | tui | Override default wide-screen layout |
| GITHUB_TOKEN | (none) | sync | GitHub API token |
| OPENAI_API_KEY | (none) | embed, refresh | OpenAI API key |

#Models

| Variable | Default | Purpose |
| --- | --- | --- |
| GITCRAWL_SUMMARY_MODEL | gpt-5.4 | Summary model (reserved for future commands) |
| GITCRAWL_EMBED_MODEL | text-embedding-3-small | OpenAI embedding model |
| GITCRAWL_OPENAI_RETRY_DISABLED | (off) | Set to 1 to disable OpenAI retry/backoff |
| GITCRAWL_OPENAI_BASE_URL / OPENAI_BASE_URL | OpenAI default | Custom OpenAI endpoint |
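
As an illustration of how these variables might be read, the sketch below resolves each setting from an environment mapping. The helper names are hypothetical, and checking `GITCRAWL_OPENAI_BASE_URL` before `OPENAI_BASE_URL` is an assumption; this page lists both names without stating their precedence.

```python
from typing import Mapping, Optional


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the first non-empty variable among names, or None."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def openai_settings(env: Mapping[str, str]) -> dict:
    return {
        # None stands for "use the OpenAI default endpoint".
        # Assumption: the GITCRAWL_-prefixed name is consulted first.
        "base_url": first_set(env, "GITCRAWL_OPENAI_BASE_URL", "OPENAI_BASE_URL"),
        "embed_model": env.get("GITCRAWL_EMBED_MODEL") or "text-embedding-3-small",
        # Retry/backoff stays on unless the variable is set to 1.
        "retry_enabled": env.get("GITCRAWL_OPENAI_RETRY_DISABLED") != "1",
    }
```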

#GitHub overrides

| Variable | Default | Purpose |
| --- | --- | --- |
| GITCRAWL_GITHUB_BASE_URL / GITHUB_BASE_URL | GitHub default | Custom GitHub API endpoint |
| GH_REPO | (none) | Default repository for compatible local search shapes |

#gh shim

gitcrawl gh moved to Octopool. Run octopool login, then use octopool gh ....

#Configuration defaults

| Field | Default |
| --- | --- |
| summary_model | gpt-5.4 |
| embed_model | text-embedding-3-small |
| embed_dimensions | 1024 |
| embedding_basis | title_original |
| vector_backend | exact; turbovec requires Python turbovec and dimensions divisible by 8 |
| batch_size (embeddings) | 64 |
| concurrency (embeddings) | 2 |
| tui_default_sort | size |
| tui_default_layout | columns |
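
The turbovec constraint on dimensions can be checked before switching backends. The sketch below is a hypothetical pre-flight check, not gitcrawl's own validation. It assumes `exact` and `turbovec` are the only backend values, since those are the only two named here, and it does not verify that the Python turbovec package is installed.

```python
def check_vector_backend(backend: str, dimensions: int) -> None:
    """Raise ValueError if the backend/dimension pair violates the stated constraint."""
    if backend == "exact":
        return
    if backend == "turbovec":
        if dimensions % 8 != 0:
            raise ValueError(
                f"turbovec needs dimensions divisible by 8, got {dimensions}"
            )
        return
    # Assumption: no other backend values exist.
    raise ValueError(f"unknown vector_backend: {backend!r}")
```

The default `embed_dimensions` of 1024 satisfies the constraint, since 1024 = 8 × 128.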

#Clustering defaults

| Parameter | Default | Source |
| --- | --- | --- |
| --threshold | 0.80 | cluster, refresh |
| --cross-kind-threshold | 0.93 | cluster, refresh |
| --min-size | 1 | cluster, refresh |
| --max-cluster-size | 40 | cluster, refresh |
| --k (nearest-neighbor fanout) | 16 | cluster, refresh |
| Weak-edge title overlap floor | 0.18 | internal |
| High-confidence edge score | 0.90 | internal |
| Deterministic reference edge score | 0.94 | internal |
| Body-only reference prefix length | 240 chars | internal |
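
One plausible reading of the two similarity thresholds is that pairs of different kinds must clear the stricter cutoff. The sketch below illustrates that reading only. The rule itself, the name `keeps_edge`, and the kind labels `"issue"` and `"pr"` are assumptions rather than documented gitcrawl behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    same_kind: float = 0.80   # --threshold default
    cross_kind: float = 0.93  # --cross-kind-threshold default


def keeps_edge(score: float, kind_a: str, kind_b: str,
               t: Thresholds = Thresholds()) -> bool:
    """Assumed rule: cross-kind pairs need the higher similarity score."""
    cutoff = t.same_kind if kind_a == kind_b else t.cross_kind
    return score >= cutoff
```

Under this reading, a score of 0.85 would link two items of the same kind but not an item of one kind to an item of another.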

#TUI defaults

| Parameter | Default |
| --- | --- |
| --min-size | 5 |
| --sort | size |
| --layout | columns |
| Working set limit | 500 rows |
| Refresh interval | 15s |

#Output formats

| Format | Where to use |
| --- | --- |
| text | Human terminal use (default) |
| json | Pipelines, scripts, agents (also via --json) |
| log | Internal structured logging output |

#Exit codes

  • 0 — success
  • non-zero — usage error, "not implemented" command, or runtime failure

stderr always carries error messages. stdout is reserved for command output.
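
Because stdout carries only command output and errors go to stderr, a script can parse JSON output directly and treat any non-zero exit as a failure. A minimal sketch of such a wrapper follows; `run_json` is a hypothetical helper, and the caller supplies the full argument list.

```python
import json
import subprocess
import sys
from typing import Any, List


def run_json(argv: List[str]) -> Any:
    """Run a command, fail on a non-zero exit code, and parse stdout as JSON."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        # The error message is expected on stderr, not stdout.
        raise RuntimeError(
            f"{argv[0]} exited with {proc.returncode}: {proc.stderr.strip()}"
        )
    return json.loads(proc.stdout)
```

A call would look like `run_json(["gitcrawl", <command>, "--json"])`, with `<command>` replaced by a command that supports JSON output.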

#File-System Layout

Linux:

~/.config/gitcrawl/
├── config.toml
└── stores/
    └── gitcrawl-store/          # portable-store checkout (optional)
        └── data/
            └── owner__repo.sync.db

~/.local/share/gitcrawl/
├── gitcrawl.db                  # SQLite mirror
├── gitcrawl.db-shm              # SQLite shared-memory file
├── gitcrawl.db-wal              # SQLite write-ahead log
└── vectors/                     # vector store backing embeddings

~/.cache/gitcrawl/               # local runtime cache

~/.local/state/gitcrawl/
└── logs/                        # operational logs

macOS:

~/Library/Application Support/gitcrawl/
├── config.toml
├── gitcrawl.db                  # SQLite mirror
├── gitcrawl.db-shm              # SQLite shared-memory file
├── gitcrawl.db-wal              # SQLite write-ahead log
├── vectors/                     # vector store backing embeddings
├── logs/
└── stores/
    └── gitcrawl-store/          # portable-store checkout (optional)
        └── data/
            └── owner__repo.sync.db

~/Library/Caches/gitcrawl/       # local runtime cache

Older docs showed cache/pr; current Gitcrawl stores PR details in SQLite instead of a cache/pr directory.

#See also