The watcher exposes a lightweight HTTP API for search, metadata enrichment, and operational control.
Default address: http://127.0.0.1:1936 (configurable via api.host and api.port)
Health check and service stats.
curl http://localhost:1936/status
{
"status": "ok",
"uptime": 86400,
"version": "0.9.9",
"collection": {
"name": "jeeves_archive",
"pointCount": 10498,
"dimensions": 3072
},
"reindex": {
"active": false
},
"initialScan": {
"active": true,
"filesMatched": 156576,
"filesEnqueued": 156576,
"startedAt": "2026-03-15T01:22:00.000Z"
}
}
| Field | Type | Description |
|---|---|---|
| status | string | Always "ok" if the server is responding. |
| uptime | number | Process uptime in seconds. |
| version | string | Service package version (from package.json). |
| collection | object | Qdrant collection stats. |
| collection.name | string | Collection name. |
| collection.pointCount | number | Total indexed points (chunks). |
| collection.dimensions | number | Embedding vector dimensions. |
| reindex | object | Active reindex status. |
| reindex.active | boolean | Whether a reindex is currently running. |
| reindex.scope | string? | Reindex scope if active: "rules", "full", "issues", "path", or "prune". |
| reindex.startedAt | string? | ISO-8601 timestamp when reindex started, if active. |
| initialScan | object | Initial filesystem scan state after service start. |
| initialScan.active | boolean | Whether the initial scan is still in progress. |
| initialScan.filesMatched | number? | Files matched by watch globs so far. |
| initialScan.filesEnqueued | number? | Files enqueued for processing so far. |
| initialScan.startedAt | string? | ISO-8601 timestamp when the scan started. |
| initialScan.completedAt | string? | ISO-8601 timestamp when the scan completed (when active is false). |
| initialScan.durationMs | number? | Total scan duration in milliseconds (when active is false). |
v0.5.0 change: payloadFields has been removed from the status response. Use POST /config/query or GET /config/schema to discover schema and payload field information.
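As a rough illustration of how a client might consume these fields, the sketch below decides whether the watcher is idle from an already-parsed GET /status response. The interface covers only a subset of the documented fields, and the helper names `isWatcherIdle` and `describeActivity` are hypothetical, not part of the service.

```typescript
// Subset of the GET /status response used below (field names from the table above).
interface WatcherStatusSnapshot {
  reindex: { active: boolean; scope?: string; startedAt?: string };
  initialScan: { active: boolean; filesMatched?: number; filesEnqueued?: number };
}

// Hypothetical helper: true when neither the initial scan nor a reindex is running.
function isWatcherIdle(snapshot: WatcherStatusSnapshot): boolean {
  return !snapshot.initialScan.active && !snapshot.reindex.active;
}

// Hypothetical helper: one-line description of whatever is still running.
function describeActivity(snapshot: WatcherStatusSnapshot): string {
  if (snapshot.initialScan.active) {
    const enqueued = snapshot.initialScan.filesEnqueued;
    return enqueued === undefined
      ? "initial scan in progress"
      : "initial scan in progress (" + enqueued + " files enqueued)";
  }
  if (snapshot.reindex.active) {
    return "reindex in progress (scope: " + (snapshot.reindex.scope || "unknown") + ")";
  }
  return "idle";
}
```

A polling client could call these after each status request and wait until `isWatcherIdle` returns true before running searches that need a complete index.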
Status codes:
- 200 OK: Service is healthy

Enrich a document's metadata without re-embedding.
curl -X POST http://localhost:1936/metadata \
-H "Content-Type: application/json" \
-d '{
"path": "D:/projects/readme.md",
"metadata": {
"title": "Project Overview",
"labels": ["documentation", "important"],
"priority": "high"
}
}'
Body schema:
{
path: string; // File path (must be indexed)
metadata: Record<string, unknown>; // Metadata to merge
}
Success (200 OK):
{
"ok": true
}
Validation error (400 Bad Request):
{
"error": "Metadata validation failed",
"details": [
{ "field": "priority", "message": "must be one of: low, medium, high" }
]
}
v0.5.0: Metadata is validated against the schema defined by matched inference rules. If the file matches a rule with a schema block, the provided metadata must conform to it.
Error (500 Internal Server Error):
{
"error": "Internal server error"
}
- .meta.json sidecar file

If the file isn't indexed yet:
Render a file through the inference rule engine without embedding. Returns the transformed (or passthrough) content along with metadata.
v0.8.0+
curl -X POST http://localhost:1936/render \
-H "Content-Type: application/json" \
-d '{"path": "j:/domains/slack/C0ABC/1234567.json"}'
Body schema:
{
path: string; // File path (must be within watched scope)
}
Success (200 OK):
{
"renderAs": "md",
"content": "---\nchannelName: general\n---\n# Message\nHello world",
"rules": ["slack-message", "json-subject"],
"metadata": {
"domain": "slack",
"entity_type": "message",
"matched_rules": ["slack-message", "json-subject"]
}
}
| Field | Type | Description |
|---|---|---|
| renderAs | string | Output content type (file extension without dot). Always present. |
| content | string | Rendered content (from template/render transform) or extracted text (passthrough). |
| rules | string[] | Names of matched inference rules (diagnostic). |
| metadata | object | Composed embedding properties from matched rules. |
- withCache with the configured TTL.
- Cache-Control: no-cache header set; bypasses the cache.

| Code | Condition |
|---|---|
| 200 | Success |
| 400 | Missing path field |
| 403 | Path is outside watched scope |
| 404 | File not found |
| 422 | Render/extraction failed |
- watch.paths and watch.ignored globs
- buildMergedMetadata (same pipeline as indexing)
- renderAs: rule-declared value → file extension → "txt"

Use cases:
Returns schema-derived facet definitions for building search filter UIs.
v0.8.0+
curl http://localhost:1936/search/facets
Success (200 OK):
{
"facets": [
{
"field": "domain",
"type": "string",
"uiHint": "dropdown",
"values": ["email", "jira", "meetings", "slack"],
"rules": ["email-archive", "jira-issue", "meetings-transcript", "slack-message"]
},
{
"field": "priority",
"type": "string",
"uiHint": "dropdown",
"values": ["Critical", "High", "Medium", "Low"],
"rules": ["jira-issue"]
}
]
}
| Field | Type | Description |
|---|---|---|
| facets | Facet[] | Array of facet definitions. |

Facet:

| Field | Type | Description |
|---|---|---|
| field | string | Metadata field name. |
| type | string | JSON Schema type (e.g. "string", "number", "boolean"). |
| uiHint | string | UI rendering hint (e.g. "dropdown", "tags", "date"). |
| values | unknown[] | Known values. Uses enum values if declared; otherwise live values from the index. |
| rules | string[] | Which inference rules define this field. |
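One way a filter UI could turn facet selections into the filter argument of POST /search is sketched below. It assumes selections are combined with AND across fields and OR within a field (using Qdrant's `match.any` condition for multiple values). The `FacetSelection` shape and the helper name are illustrative and not part of the API.

```typescript
// One condition inside a Qdrant filter's "must" array.
interface WatcherMatchCondition {
  key: string;
  match: { value: string | number } | { any: Array<string | number> };
}

// Hypothetical UI state: facet field name -> values the user ticked.
type FacetSelection = { [field: string]: Array<string | number> };

// Build a Qdrant filter: every selected field must match (AND),
// and within a field any of the chosen values may match (OR).
function facetSelectionToFilter(selection: FacetSelection): { must: WatcherMatchCondition[] } {
  const must: WatcherMatchCondition[] = [];
  // Sort field names so the generated filter is deterministic.
  for (const field of Object.keys(selection).sort()) {
    const chosen = selection[field] || [];
    const first = chosen[0];
    if (first === undefined) continue; // nothing picked for this facet
    if (chosen.length === 1) {
      must.push({ key: field, match: { value: first } });
    } else {
      must.push({ key: field, match: { any: chosen } });
    }
  }
  return { must };
}
```

The returned object can be sent as the `filter` property of a POST /search or POST /scan body.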
- mergeSchemas
- uiHint or enum
- uiHint)

Caching: The schema structure is computed once and cached until inference rules change. Live values from ValuesManager are merged fresh on each request.
Use cases:
Semantic search across indexed documents.
curl -X POST http://localhost:1936/search \
-H "Content-Type: application/json" \
-d '{
"query": "machine learning algorithms",
"limit": 10
}'
Body schema:
{
query: string; // Natural language search query
limit?: number; // Max results (default: 10)
offset?: number; // Skip N results for pagination (default: 0)
filter?: Record<string, unknown>; // Qdrant filter object (optional)
}
Filtered search example:
curl -X POST http://localhost:1936/search \
-H "Content-Type: application/json" \
-d '{
"query": "authentication flow",
"limit": 5,
"filter": {
"must": [{ "key": "domain", "match": { "value": "backend" } }]
}
}'
Paginated search example:
curl -X POST http://localhost:1936/search \
-H "Content-Type: application/json" \
-d '{
"query": "authentication flow",
"limit": 10,
"offset": 20
}'
The filter parameter accepts a native Qdrant filter object. Use POST /config/query or GET /config/schema to discover available payload fields and their types, then construct filters accordingly. Common patterns:
- { "must": [{ "key": "domain", "match": { "value": "email" } }] }
- { "must_not": [{ "key": "domain", "match": { "value": "codebase" } }] }
- must array
- { "key": "chunk_text", "match": { "text": "keyword" } } (tokenized)

Success (200 OK):
[
{
"id": "uuid-chunk-0",
"score": 0.87,
"payload": {
"file_path": "d:/projects/ml/readme.md",
"chunk_index": 0,
"total_chunks": 3,
"content_hash": "sha256:abc123...",
"chunk_text": "Machine learning is...",
"domain": "projects",
"title": "ML Overview"
}
},
{
"id": "uuid-chunk-1",
"score": 0.82,
"payload": { /* ... */ }
}
]
Error (500 Internal Server Error):
{
"error": "Internal server error"
}
Each result is a Qdrant point:
| Field | Type | Description |
|---|---|---|
| id | string | Qdrant point ID (deterministic UUID from file path + chunk index). |
| score | number | Cosine similarity score (0–1, higher is better). |
| payload | object | Document metadata and chunk info. |

Payload fields:

| Field | Type | Description |
|---|---|---|
| file_path | string | Normalized file path (forward slashes). |
| chunk_index | number | Chunk index (0-based). |
| total_chunks | number | Total chunks for this file. |
| content_hash | string | SHA-256 hash of extracted text. |
| chunk_text | string | Text content of this chunk. |
| Custom fields | any | Metadata from inference rules and POST /metadata. |
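Because each search result is a chunk rather than a file, a caller that wants one entry per document can group hits by file_path. The following is a minimal sketch of that grouping. The type and function names are hypothetical, and ranking files by their best chunk score is one possible choice, not something the API prescribes.

```typescript
// Shape of one POST /search result, limited to the fields used here.
interface WatcherSearchHit {
  id: string;
  score: number;
  payload: { file_path: string; chunk_index: number; [field: string]: unknown };
}

// Hypothetical per-file grouping of chunk hits.
interface WatcherFileHit {
  filePath: string;
  bestScore: number;
  chunks: WatcherSearchHit[];
}

function groupHitsByFile(hits: WatcherSearchHit[]): WatcherFileHit[] {
  // Prototype-free lookup so unusual paths cannot collide with Object members.
  const byPath: { [path: string]: WatcherFileHit } = Object.create(null);
  const groups: WatcherFileHit[] = [];
  for (const hit of hits) {
    const filePath = hit.payload.file_path;
    let group = byPath[filePath];
    if (group === undefined) {
      group = { filePath, bestScore: hit.score, chunks: [] };
      byPath[filePath] = group;
      groups.push(group);
    }
    group.bestScore = Math.max(group.bestScore, hit.score);
    group.chunks.push(hit);
  }
  // Files with the highest-scoring chunk come first.
  return groups.sort((a, b) => b.bestScore - a.bestScore);
}
```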
- limit results (includes all chunks; caller groups by file_path if needed)

Filter-only point query without vector search. Returns metadata for points matching a Qdrant filter with cursor-based pagination.
curl -X POST http://localhost:1936/scan \
-H "Content-Type: application/json" \
-d '{
"filter": {
"must": [{ "key": "domain", "match": { "value": "email" } }]
},
"limit": 50
}'
Body schema:
{
filter: Record<string, unknown>; // Qdrant filter object (required)
limit?: number; // Page size (default: 100, max: 1000)
cursor?: string | number; // Opaque cursor from previous response
fields?: string[]; // Payload field projection
countOnly?: boolean; // If true, return { count } instead of points
}
Success (200 OK) — normal scan:
{
"points": [
{ "id": "uuid-chunk-0", "payload": { "file_path": "j:/domains/email/msg.json", "domain": "email" } },
{ "id": "uuid-chunk-1", "payload": { "file_path": "j:/domains/email/msg2.json", "domain": "email" } }
],
"cursor": "next-abc123"
}
Success (200 OK) — count only:
{
"count": 4217
}
Last page (no more results):
{
"points": [],
"cursor": null
}
| Field | Type | Description |
|---|---|---|
| points | ScrolledPoint[] | Matched points with payload. |
| cursor | string \| number \| null | Opaque cursor for next page. null when no more results. |
| count | number | Total matching points (only when countOnly: true). |

| Code | Condition |
|---|---|
| 200 | Success |
| 400 | Missing or invalid filter, or limit out of bounds (1–1000) |
| 500 | Server error |
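Paging through /scan amounts to feeding each response's cursor into the next request until the cursor comes back null. The sketch below shows only that hand-off, plus a client-side check mirroring the documented 1–1000 limit range. The type and function names are hypothetical, and the HTTP call itself is left to the caller.

```typescript
type WatcherScanCursor = string | number;

// Request body for POST /scan (countOnly omitted for brevity).
interface WatcherScanRequest {
  filter: { [key: string]: unknown };
  limit?: number;
  cursor?: WatcherScanCursor;
  fields?: string[];
}

// Response page for a normal (non-count) scan.
interface WatcherScanPage {
  points: Array<{ id: string; payload: { [field: string]: unknown } }>;
  cursor: WatcherScanCursor | null;
}

// Mirrors the documented bounds; the server rejects values outside 1-1000 with 400.
function isValidScanLimit(limit: number): boolean {
  return limit >= 1 && limit <= 1000;
}

// Returns the body for the next page, or null once the last page was reached.
function nextScanRequest(previous: WatcherScanRequest, page: WatcherScanPage): WatcherScanRequest | null {
  if (page.cursor === null) return null;
  // Copy the previous request so filter, limit, and fields stay unchanged.
  return { ...previous, cursor: page.cursor };
}
```

A caller would POST the first body, pass each response to `nextScanRequest`, and stop when it returns null.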
- filter is a non-null object
- limit to 1–1000 range (rejects out-of-bounds with 400)
- countOnly: calls Qdrant count() with exact mode and returns { count }
- scroll() with the filter, limit, cursor, and optional field projection

Use cases:
Difference from POST /search:
- /scan does NOT embed a query or compute similarity scores
- /scan uses cursor-based pagination (efficient for large result sets)
- /search uses offset-based pagination (suitable for small ranked result sets)

Trigger a full reindex of all watched files.
curl -X POST http://localhost:1936/reindex
Success (200 OK):
{
"ok": true,
"filesIndexed": 1234
}
Error (500 Internal Server Error):
{
"error": "Internal server error"
}
- watch.paths globs

Use cases:
Note: This is a blocking operation — the API returns after all files are processed. For very large corpora, expect long response times.
Rebuild the metadata store from Qdrant payloads.
curl -X POST http://localhost:1936/rebuild-metadata
Success (200 OK):
{
"ok": true
}
Error (500 Internal Server Error):
{
"error": "Internal server error"
}
- file_path and enrichment metadata from payload
- .meta.json sidecar file to the metadata store
- chunk_index, total_chunks, content_hash, chunk_text)

Use cases:
Cost: No embedding API calls — pure data extraction.
Trigger a scoped reindex operation. All responses include a plan object showing blast area.
Body schema:
{
scope?: "issues" | "full" | "rules" | "path" | "prune"; // Default: "issues"
path?: string; // Required when scope is "path"
dryRun?: boolean; // Compute plan without executing (default: false)
}
Examples:
# Rules-only reindex (re-apply inference rules, no re-embedding)
curl -X POST http://localhost:1936/config-reindex \
-H "Content-Type: application/json" \
-d '{"scope": "rules"}'
# Full reindex (re-extract, re-embed, re-upsert all files)
curl -X POST http://localhost:1936/config-reindex \
-H "Content-Type: application/json" \
-d '{"scope": "full"}'
# Re-process only files with embedding failures
curl -X POST http://localhost:1936/config-reindex \
-H "Content-Type: application/json" \
-d '{"scope": "issues"}'
# Re-embed a specific directory
curl -X POST http://localhost:1936/config-reindex \
-H "Content-Type: application/json" \
-d '{"scope": "path", "path": "j:/domains/projects/jeeves-watcher"}'
# Delete orphaned points (files no longer in watch scope)
curl -X POST http://localhost:1936/config-reindex \
-H "Content-Type: application/json" \
-d '{"scope": "prune"}'
# Dry run: preview blast area without executing
curl -X POST http://localhost:1936/config-reindex \
-H "Content-Type: application/json" \
-d '{"scope": "prune", "dryRun": true}'
Success (200 OK) — normal execution:
{
"status": "started",
"scope": "rules",
"plan": {
"total": 148000,
"toProcess": 148000,
"toDelete": 0,
"byRoot": {
"j:/domains": 95000,
"j:/config": 3000
}
}
}
Success (200 OK) — dry run:
{
"status": "dry_run",
"scope": "prune",
"plan": {
"total": 562000,
"toProcess": 0,
"toDelete": 2300,
"byRoot": {
"j:/jeeves/node_modules": 1800,
"j:/jeeves/.bridge": 500
}
}
}
Plan fields:
| Field | Type | Description |
|---|---|---|
| total | number | Total points (prune) or files (other scopes) examined. |
| toProcess | number | Items to embed/re-apply rules (0 for prune). |
| toDelete | number | Points to delete (prune only, 0 for others). |
| byRoot | object | Counts grouped by watch root prefix. |
The reindex runs asynchronously — the API returns immediately with the plan. Use GET /status to track progress.
Error codes:
| Code | Condition |
|---|---|
| 200 | Success (started or dry_run) |
| 400 | Invalid scope, missing path for path scope, or prune without vectorStore |
| 500 | Server error |
issues (default): Re-processes only files that previously failed embedding (from GET /issues). Use after fixing the root cause of failures (permissions, encoding, file format).

rules: Re-reads file attributes, re-applies current inference rules, and updates Qdrant payloads. No re-embedding. Use after editing inference rules.

full: Re-extracts text, re-embeds (new embedding API calls), and re-upserts to Qdrant. Use after changing embedding providers, chunk size, or adding watch paths.

path: Re-embeds a specific file or all files under a directory. Validates that the target is within watch scope and not gitignored. Requires the path parameter. Use for targeted re-embedding without a full reindex.

prune: Scrolls all Qdrant points, checks each file_path against current watch scope and gitignore, and batch-deletes orphaned points. No re-embedding and no file reads: only a Qdrant scroll, local checks, and deletes. Use after changing watch.paths, adding gitignore rules, or to clean up stale data (e.g., indexed node_modules).
Note: prune is NOT valid for the config-watch auto-trigger (configWatch.reindex). It can only be triggered via explicit API calls.
When dryRun: true, the plan is computed and returned synchronously without starting the async job. Works for all scopes. Use to preview impact before committing to a large operation.
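A caller could use the dry-run plan as a guard before starting a large job. The sketch below compares a plan against caller-chosen thresholds and reports which root contributes the most items. `PlanBudget` and both function names are hypothetical; the API itself defines no such limits.

```typescript
// The plan object returned by POST /config-reindex.
interface WatcherReindexPlan {
  total: number;
  toProcess: number;
  toDelete: number;
  byRoot: { [root: string]: number };
}

// Hypothetical caller-side thresholds; not part of the API.
interface PlanBudget {
  maxProcess: number;
  maxDelete: number;
}

// Lists the ways a plan exceeds the budget; an empty array means it is acceptable.
function planViolations(plan: WatcherReindexPlan, budget: PlanBudget): string[] {
  const violations: string[] = [];
  if (plan.toProcess > budget.maxProcess) {
    violations.push("toProcess " + plan.toProcess + " exceeds " + budget.maxProcess);
  }
  if (plan.toDelete > budget.maxDelete) {
    violations.push("toDelete " + plan.toDelete + " exceeds " + budget.maxDelete);
  }
  return violations;
}

// Returns the watch root with the largest count, or null if byRoot is empty.
function largestPlanRoot(plan: WatcherReindexPlan): string | null {
  let best: string | null = null;
  let bestCount = -1;
  for (const root of Object.keys(plan.byRoot)) {
    const count = plan.byRoot[root] || 0;
    if (count > bestCount) {
      best = root;
      bestCount = count;
    }
  }
  return best;
}
```

If `planViolations` returns an empty array, the caller can repeat the same request without dryRun to start the job.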
While a reindex is running, GET /status shows the active reindex state:
{
"reindex": { "active": true, "scope": "rules", "startedAt": "2026-02-24T08:00:00Z" }
}
If reindex.callbackUrl is configured, the watcher sends a POST request to that URL when the reindex completes:
{
"scope": "rules",
"filesProcessed": 1234,
"durationMs": 45000,
"status": "completed"
}
The callback retries with exponential backoff (3 attempts, starting at 1 second).
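On the receiving side, a callback handler may want to check the payload shape before acting on it. The following sketch validates a parsed request body against the four fields shown above and derives a rough throughput figure. The function names are hypothetical, and the sketch does not assume any particular web framework.

```typescript
// Callback payload fields as shown in the example above.
interface WatcherReindexCallback {
  scope: string;
  filesProcessed: number;
  durationMs: number;
  status: string;
}

// Returns a typed payload, or null if the body does not have the expected shape.
function parseReindexCallback(body: unknown): WatcherReindexCallback | null {
  if (typeof body !== "object" || body === null) return null;
  const fields = body as { [key: string]: unknown };
  const scope = fields["scope"];
  const filesProcessed = fields["filesProcessed"];
  const durationMs = fields["durationMs"];
  const outcome = fields["status"];
  if (
    typeof scope !== "string" ||
    typeof filesProcessed !== "number" ||
    typeof durationMs !== "number" ||
    typeof outcome !== "string"
  ) {
    return null;
  }
  return { scope, filesProcessed, durationMs, status: outcome };
}

// Average files per second over the whole run; 0 when no duration was reported.
function filesPerSecond(callback: WatcherReindexCallback): number {
  if (callback.durationMs <= 0) return 0;
  return callback.filesProcessed / (callback.durationMs / 1000);
}
```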
Returns the current issues file contents: all files that failed to embed, with error details.
v0.5.0+
curl http://localhost:1936/issues
Success (200 OK):
{
"count": 2,
"issues": {
"j:/domains/jira/VCN/issue/WEB-123.json": [
{
"type": "type_collision",
"property": "created",
"rules": ["jira-issue", "frontmatter-created"],
"types": ["integer", "string"],
"message": "Type collision on 'created': jira-issue declares integer, frontmatter-created declares string",
"timestamp": 1771865063
}
],
"j:/domains/email/archive/msg-456.json": [
{
"type": "interpolation_error",
"property": "author_email",
"rule": "email-archive",
"message": "Failed to resolve ${json.from.email}: 'from' is null",
"timestamp": 1771865100
}
]
}
}
| Field | Type | Description |
|---|---|---|
| count | number | Number of files with issues (not total issue count). |
| issues | object | Issues keyed by file path. |

| Field | Type | Description |
|---|---|---|
| type | string | Error category: "type_collision" or "interpolation_error". |
| property | string? | Property name where the issue occurred. |
| rules | string[]? | Rule names involved in the issue (for type collisions). |
| rule | string? | Rule name for single-rule issues (backward compat). |
| types | string[]? | Declared types for type collision issues. |
| message | string | Human-readable error message. |
| timestamp | number \| string | Unix timestamp (seconds) or ISO string of last occurrence. |
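Since count reports files rather than individual issues, a client that needs the total has to walk the issues map itself. The sketch below does that and also tallies issues per type. The interface and function names are hypothetical.

```typescript
// One issue entry, limited to the fields used here plus common optional ones.
interface WatcherIssue {
  type: string;
  message: string;
  property?: string;
  rule?: string;
  rules?: string[];
}

// Shape of the GET /issues response.
interface WatcherIssuesResponse {
  count: number;
  issues: { [filePath: string]: WatcherIssue[] };
}

interface WatcherIssueSummary {
  files: number; // number of files with at least one entry
  total: number; // number of individual issue entries
  byType: { [type: string]: number };
}

function summarizeIssues(response: WatcherIssuesResponse): WatcherIssueSummary {
  const byType: { [type: string]: number } = Object.create(null);
  let total = 0;
  const filePaths = Object.keys(response.issues);
  for (const filePath of filePaths) {
    for (const issue of response.issues[filePath] || []) {
      byType[issue.type] = (byType[issue.type] || 0) + 1;
      total += 1;
    }
  }
  return { files: filePaths.length, total, byType };
}
```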
The issues file is self-healing:
Empty response:
{
"count": 0,
"issues": {}
}
Query the merged virtual configuration document using JSONPath expressions.
curl -X POST http://localhost:1936/config/query \
-H "Content-Type: application/json" \
-d '{
"path": "$.inferenceRules[*].name",
"resolve": ["files", "globals"]
}'
Body schema:
{
path: string; // JSONPath expression
resolve?: string[]; // Resolution scopes: "files", "globals" (default: all)
}
Success (200 OK):
{
"result": ["email-classifier", "meeting-tagger", "project-labeler"]
}
The query runs against a virtual document that merges:
- payloadFields from status)

See Inference Rules Guide for rule structure details.
Returns the JSON Schema describing the merged virtual document (authored config + runtime state).
v0.5.0+
curl http://localhost:1936/config/schema
Success (200 OK):
Returns a JSON Schema object describing the merged config document shape, including:
- description, search, schemas, inferenceRules, etc.)
- inferenceRules[].values, issues, helper introspection)

The schema is generated from Zod using z.toJSONSchema() and describes the queryable surface exposed by POST /config/query.
Example response (excerpt):
{
"type": "object",
"properties": {
"description": { "type": "string" },
"schemas": { "type": "array" },
"inferenceRules": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"description": { "type": "string" },
"values": {
"type": "object",
"additionalProperties": { "type": "array" }
}
}
}
}
}
}
Tests file paths against inference rules and watch scope without indexing.
v0.5.0+
curl -X POST http://localhost:1936/config/match \
-H "Content-Type: application/json" \
-d '{
"paths": [
"j:/domains/jira/VCN/issue/WEB-123.json",
"j:/domains/slack/C0ABC/1234567.json",
"j:/domains/unknown/file.txt"
]
}'
Body schema:
{
paths: string[]; // File paths to test
}
Success (200 OK):
{
"matches": [
{ "rules": ["jira-issue", "json-subject"], "watched": true },
{ "rules": ["slack-message", "json-participants"], "watched": true },
{ "rules": [], "watched": false }
]
}
| Field | Type | Description |
|---|---|---|
| matches | PathMatch[] | Match results for each input path (same order). |

PathMatch:

| Field | Type | Description |
|---|---|---|
| rules | string[] | Ordered list of matching inference rule names. |
| watched | boolean | Whether the path is within watch scope (matches watch.paths and not in watch.ignored). |
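Because matches come back in the same order as the input paths, a client can pair them by index. The sketch below sorts paths into three buckets based on the watched flag and the rules array. The `MatchReport` shape and the function name are hypothetical.

```typescript
// One entry of the "matches" array returned by POST /config/match.
interface WatcherPathMatch {
  rules: string[];
  watched: boolean;
}

// Hypothetical client-side report built from paths and their matches.
interface MatchReport {
  unwatched: string[];
  watchedWithoutRules: string[];
  matched: string[];
}

function classifyMatches(paths: string[], matches: WatcherPathMatch[]): MatchReport {
  if (paths.length !== matches.length) {
    throw new Error("expected one match per input path");
  }
  const report: MatchReport = { unwatched: [], watchedWithoutRules: [], matched: [] };
  paths.forEach((path, index) => {
    const match = matches[index];
    if (match === undefined) return;
    if (!match.watched) {
      report.unwatched.push(path); // outside watch scope or ignored
    } else if (match.rules.length === 0) {
      report.watchedWithoutRules.push(path); // watched, but no inference rule applies
    } else {
      report.matched.push(path);
    }
  });
  return report;
}
```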
- inferenceRules array)
- watched tests against watch path matchers and ignore patterns
- An empty rules array means no inference rules match (but the file may still be watched)
- watched: false means the path falls outside watch scope or is excluded by ignore patterns

Use cases:
Pre-flight validation of configuration changes without applying them.
curl -X POST http://localhost:1936/config/validate \
-H "Content-Type: application/json" \
-d '{
"config": {
"inferenceRules": [
{
"name": "new-rule",
"description": "Test rule",
"match": { "type": "object" },
"schema": [
{ "properties": { "domain": { "type": "string", "set": "test" } } }
]
}
]
},
"testPaths": ["d:/docs/sample.md"]
}'
Body schema:
{
config?: Record<string, unknown>; // Partial or full config to validate
testPaths?: string[]; // File paths to test rules against
}
When config is partial, rules are merged by name: provided rules replace existing rules with the same name; unmatched existing rules are preserved.
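The merge-by-name behaviour can be sketched as follows. This is an illustration, not the watcher's implementation: where newly named rules end up (appended here) and how duplicate names within the provided list are handled (the last one wins here) are assumptions.

```typescript
// Minimal rule shape: only the name matters for merging.
interface NamedRule {
  name: string;
  [key: string]: unknown;
}

function mergeRulesByName(existing: NamedRule[], provided: NamedRule[]): NamedRule[] {
  // Index provided rules by name; later duplicates overwrite earlier ones.
  const replacements: { [name: string]: NamedRule } = Object.create(null);
  for (const rule of provided) {
    replacements[rule.name] = rule;
  }

  const seen: { [name: string]: boolean } = Object.create(null);
  const merged: NamedRule[] = [];

  // Existing rules keep their position; same-named provided rules replace them.
  for (const rule of existing) {
    const replacement = replacements[rule.name];
    merged.push(replacement !== undefined ? replacement : rule);
    seen[rule.name] = true;
  }

  // Provided rules with new names are appended (assumed placement).
  for (const rule of provided) {
    if (seen[rule.name] !== true) {
      merged.push(replacements[rule.name] || rule);
      seen[rule.name] = true;
    }
  }
  return merged;
}
```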
Valid (200 OK):
{
"valid": true,
"testResults": [
{
"path": "d:/docs/sample.md",
"matchedRules": ["new-rule"],
"metadata": { "domain": "test" }
}
]
}
Invalid (400 Bad Request):
{
"valid": false,
"errors": [
{ "path": "inferenceRules[0].match", "message": "Invalid JSON Schema" }
]
}
Validation includes checking that referenced helper files can be loaded.
Atomically validate, write, and reload configuration.
curl -X POST http://localhost:1936/config/apply \
-H "Content-Type: application/json" \
-d '{
"config": {
"inferenceRules": [
{
"name": "new-rule",
"description": "Test rule",
"match": { "type": "object" },
"schema": [
{ "properties": { "domain": { "type": "string", "set": "test" } } }
]
}
]
}
}'
Body schema:
{
config: Record<string, unknown>; // Configuration to apply
}
Success (200 OK):
{
"applied": true,
"reindexTriggered": true,
"scope": "rules"
}
Validation failure (400 Bad Request):
{
"applied": false,
"errors": [
{ "path": "inferenceRules[0].match", "message": "Invalid JSON Schema" }
]
}
- POST /config/validate)

Register virtual inference rules from an external source.
curl -X POST http://localhost:1936/rules/register \
-H "Content-Type: application/json" \
-d '{
"source": "my-external-system",
"rules": [
{
"name": "external-rule",
"description": "Rule registered from external source",
"match": { "type": "object" },
"schema": [
{ "properties": { "domain": { "type": "string", "set": "external" } } }
]
}
]
}'
Body schema:
{
source: string; // Identifier for the rule source
rules: InferenceRule[]; // Array of inference rules to register
}
Success (200 OK):
{
"ok": true,
"registered": 1
}
Remove all virtual rules from a source.
curl -X DELETE http://localhost:1936/rules/unregister \
-H "Content-Type: application/json" \
-d '{ "source": "my-external-system" }'
Body schema:
{
source: string; // Source identifier to unregister
}
Success (200 OK):
{
"ok": true,
"removed": 1
}
Remove all virtual rules from a named source (path parameter variant).
curl -X DELETE http://localhost:1936/rules/unregister/my-external-system
Success (200 OK):
{
"ok": true,
"removed": 1
}
Delete points from Qdrant matching a filter.
curl -X POST http://localhost:1936/points/delete \
-H "Content-Type: application/json" \
-d '{
"filter": {
"must": [{ "key": "domain", "match": { "value": "obsolete" } }]
}
}'
Body schema:
{
filter: Record<string, unknown>; // Qdrant filter object
}
Success (200 OK):
{
"ok": true
}
Re-apply current inference rules to indexed files matching glob patterns, without re-embedding.
curl -X POST http://localhost:1936/rules/reapply \
-H "Content-Type: application/json" \
-d '{
"globs": ["j:/domains/email/**"]
}'
Body schema:
{
globs: string[]; // Non-empty array of glob patterns
}
Success (200 OK):
{
"matched": 150,
"updated": 148
}
| Field | Type | Description |
|---|---|---|
| matched | number | Files matching the glob patterns. |
| updated | number | Files successfully re-processed. |
Use cases:
- POST /config-reindex (which re-applies to everything)

All endpoints return JSON errors with this schema:
{
"error": "ErrorClassName",
"message": "Human-readable error message"
}
Status codes:
- 200 OK: Success
- 400 Bad Request: Validation error (invalid scope, missing required field, etc.)
- 500 Internal Server Error: Server-side failure (check logs for details)

Authentication: none at present. The API is intended for localhost-only access (default host: "127.0.0.1").
For production: Use a reverse proxy (nginx, Caddy) with authentication, or bind to 0.0.0.0 only within a trusted network.
No built-in rate limiting. The embedding provider's rate limit (embedding.rateLimitPerMinute) applies, but the API itself is unbounded.
For high-traffic deployments, add rate limiting at the reverse proxy layer.
CORS is not enabled by default. To enable it for browser access, modify the Fastify server initialization (code change required).
curl -X POST http://localhost:1936/search \
-H "Content-Type: application/json" \
-d '{"query": "billing integration", "limit": 5}' \
| jq '.[] | {score, path: .payload.file_path, title: .payload.title}'
Output:
{ "score": 0.91, "path": "d:/projects/billing/spec.md", "title": "Billing API Spec" }
{ "score": 0.87, "path": "d:/meetings/2026-02-15/notes.md", "title": "Billing Discussion" }
...
for file in file1.md file2.md file3.md; do
curl -X POST http://localhost:1936/metadata \
-H "Content-Type: application/json" \
-d "{\"path\": \"$file\", \"metadata\": {\"reviewed\": true}}"
done
curl http://localhost:1936/status | jq '.uptime'
Output: 86400 (uptime in seconds)