Orbit Engine Manual

This is the single internal system manual for the Orbit engine. When architecture or runtime behavior changes, update this file in the same PR.

Purpose

Orbit’s engine converts raw application events into compact, high-signal memory that can be retrieved with low latency and high personalization quality.

Core contract:

ingest: accept signals and persist/update memory
retrieve: return ranked, relevant context
feedback: learn from outcomes
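
The three verbs above can be read as a small interface. The following is a minimal sketch, not the engine's actual API: `EngineContract`, `Memory`, `InMemoryEngine`, and every field and parameter name are hypothetical, and substring matching stands in for embedding-based retrieval.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Memory:
    """Hypothetical memory record; field names are illustrative only."""
    memory_id: str
    content: str
    score: float = 0.0


class EngineContract(Protocol):
    """Hypothetical interface mirroring ingest / retrieve / feedback."""

    def ingest(self, event: dict) -> Optional[Memory]: ...
    def retrieve(self, query: str, limit: int = 5) -> list: ...
    def feedback(self, memory_id: str, helpful: bool) -> None: ...


class InMemoryEngine:
    """Toy implementation that keeps memories in a dict."""

    def __init__(self) -> None:
        self._memories: dict = {}

    def ingest(self, event: dict) -> Optional[Memory]:
        # Accept a signal and persist it; empty content is discarded.
        content = str(event.get("content", "")).strip()
        if not content:
            return None
        memory = Memory(memory_id=f"m{len(self._memories) + 1}", content=content)
        self._memories[memory.memory_id] = memory
        return memory

    def retrieve(self, query: str, limit: int = 5) -> list:
        # Return matching memories, highest score first.
        needle = query.lower()
        hits = [m for m in self._memories.values() if needle in m.content.lower()]
        return sorted(hits, key=lambda m: m.score, reverse=True)[:limit]

    def feedback(self, memory_id: str, helpful: bool) -> None:
        # Learn from outcomes: nudge the score up or down.
        memory = self._memories.get(memory_id)
        if memory is not None:
            memory.score += 1.0 if helpful else -1.0
```

In this toy version, positive feedback on a memory moves it ahead of otherwise equal matches on the next retrieve call.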

Runtime Architecture

High-level components

InputProcessor (stage1_input)
- Cleans input and creates semantic/raw embeddings.
DecisionLogic (stage2_decision)
- Scores the event, assigns decay policy, and decides tier/store behavior.
Storage (SQLiteStorageManager or SQLAlchemyStorageManager)
- Durable memory persistence and indexed candidate search.
RetrievalRanker + retrieval service
- Candidate ranking, intent-aware reweighting, diversity controls.
AdaptivePersonalizationEngine
- Inferred memory generation from repeated patterns/feedback.
LearningLoop
- Feedback-driven updates to ranking/importance/decay learners.

Ingest Path

Ingest is write-oriented and should stay lightweight from the caller's perspective.

Flow:

Event -> processed representation.
Decision -> store/discard and tier choice.
Core memory write.
Flash pipeline maintenance (sync or async mode).
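
The four steps above can be sketched as one function. This is an illustration under stated assumptions: `process`, `decide`, the tier names, and the `pending` list are hypothetical stand-ins for InputProcessor, DecisionLogic, the storage manager, and the flash pipeline queue.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Processed:
    text: str


@dataclass
class Decision:
    store: bool
    tier: str  # tier names below are hypothetical


def process(event: dict) -> Processed:
    # Step 1: event -> processed representation (here: whitespace cleanup only).
    return Processed(text=" ".join(str(event.get("content", "")).split()))


def decide(processed: Processed) -> Decision:
    # Step 2: store/discard and tier choice (placeholder rule, not the real scorer).
    if not processed.text:
        return Decision(store=False, tier="none")
    tier = "persistent" if len(processed.text) >= 20 else "ephemeral"
    return Decision(store=True, tier=tier)


def ingest(
    event: dict,
    store: list,
    maintenance: Callable[[], None],
    mode: str = "sync",
    pending: Optional[list] = None,
) -> Decision:
    processed = process(event)
    decision = decide(processed)
    if not decision.store:
        return decision
    # Step 3: core memory write.
    store.append((decision.tier, processed.text))
    # Step 4: maintenance runs inline in sync mode, or is deferred in async mode.
    if mode == "async":
        if pending is None:
            raise ValueError("async mode needs a pending queue")
        pending.append(maintenance)
    else:
        maintenance()
    return decision
```

Discarded events return before the write, so they never trigger maintenance in either mode.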

Flash Pipeline (Ingest-side maintenance)

Goal: keep database memory clean/compact without blocking ingest-critical work.

Tasks:

conflict/lifecycle checks
inferred pattern generation
cluster compaction
periodic maintenance hooks (decay/recalibration cadence)

Modes

sync (default): maintenance executes inline.
async: maintenance runs in background queue workers.

Config:

MDE_FLASH_PIPELINE_MODE=sync|async
MDE_FLASH_PIPELINE_WORKERS
MDE_FLASH_PIPELINE_QUEUE_SIZE
MDE_FLASH_PIPELINE_MAINTENANCE_INTERVAL
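
One way to read these variables at startup is sketched below. The variable names and the `sync` default come from this manual; `FlashPipelineConfig`, `load_flash_config`, the numeric defaults, and the positive-integer validation are assumptions for illustration. The unit of the maintenance interval is not specified here, so the sketch leaves it as a plain integer.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class FlashPipelineConfig:
    mode: str
    workers: int
    queue_size: int
    maintenance_interval: int  # unit not specified in this manual


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    value = int(env.get(name, default))
    if value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


def load_flash_config(env: Mapping[str, str] = os.environ) -> FlashPipelineConfig:
    mode = env.get("MDE_FLASH_PIPELINE_MODE", "sync").strip().lower()
    if mode not in ("sync", "async"):
        raise ValueError(f"MDE_FLASH_PIPELINE_MODE must be sync or async, got {mode!r}")
    return FlashPipelineConfig(
        mode=mode,
        # The numeric defaults below are placeholders, not the engine's real defaults.
        workers=_positive_int(env, "MDE_FLASH_PIPELINE_WORKERS", 2),
        queue_size=_positive_int(env, "MDE_FLASH_PIPELINE_QUEUE_SIZE", 1000),
        maintenance_interval=_positive_int(
            env, "MDE_FLASH_PIPELINE_MAINTENANCE_INTERVAL", 100
        ),
    )
```

Failing fast on an unknown mode avoids silently falling back to inline maintenance when async was intended.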

Operational counters are exposed via /v1/metrics.

Retrieve Path

Retrieve is latency-sensitive and the primary runtime bottleneck.

Current strategy:

Semantic candidate preselection.
Learned ranking.
Query-aware reweighting and diversity handling.
Intent caps and inferred-memory probe coverage.
Ranked memory response with provenance metadata.
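
The strategy above can be sketched as a short pipeline. This is a simplified illustration, not the RetrievalRanker implementation: cosine similarity stands in for semantic preselection, a fixed 0.7/0.3 blend stands in for the learned ranker, and `Candidate`, `pool_size`, `top_k`, and `intent_cap` are hypothetical names. Inferred-memory probe coverage is not modeled.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    memory_id: str
    intent: str
    embedding: list
    importance: float  # stand-in for a learned ranking signal


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, candidates, pool_size=20, top_k=5, intent_cap=2):
    # 1. Semantic preselection: keep the pool_size most similar candidates.
    scored = [(cosine(query_embedding, c.embedding), c) for c in candidates]
    pool = sorted(scored, key=lambda pair: pair[0], reverse=True)[:pool_size]

    # 2. Ranking: a fixed blend replaces the learned ranker in this sketch.
    ranked = sorted(
        pool, key=lambda pair: 0.7 * pair[0] + 0.3 * pair[1].importance, reverse=True
    )

    # 3. Intent caps: at most intent_cap results per intent, for diversity.
    per_intent = Counter()
    results = []
    for similarity, candidate in ranked:
        if per_intent[candidate.intent] >= intent_cap:
            continue
        per_intent[candidate.intent] += 1
        # 4. Attach minimal provenance metadata to each result.
        results.append(
            {
                "memory_id": candidate.memory_id,
                "intent": candidate.intent,
                "similarity": round(similarity, 3),
            }
        )
        if len(results) == top_k:
            break
    return results
```

Because the cap is applied after ranking, a lower-scored memory from a different intent can displace a higher-scored one from an intent that is already full.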

Data Quality Principles

Favor compact, reusable facts over long prompt blobs.
Preserve provenance (why/when/type/derived_from) for debuggability.
Detect and track contested facts instead of silently overwriting.
Keep noisy assistant-heavy memories from crowding concise user profile signals.
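
A minimal sketch of the provenance and contested-fact principles follows. The `why`, `when`, `type`, and `derived_from` fields are taken from the list above; the `Fact` class, the `contested_by` field, and `upsert_fact` are hypothetical and do not describe the engine's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    key: str
    value: str
    why: str    # reason the fact was stored
    when: str   # timestamp, kept as a string in this sketch
    type: str   # e.g. a profile fact vs. an inferred fact
    derived_from: list = field(default_factory=list)
    contested_by: list = field(default_factory=list)


def upsert_fact(facts: dict, incoming: Fact) -> Fact:
    """Store a fact; record conflicts instead of silently overwriting."""
    current = facts.get(incoming.key)
    if current is None:
        facts[incoming.key] = incoming
        return incoming
    if current.value != incoming.value:
        # Keep the existing value and track the disagreement for later resolution.
        current.contested_by.append(incoming)
    return current
```

How a contested fact is eventually resolved (recency, confidence, user confirmation) is left open here; the sketch only shows that the conflicting value and its provenance are retained.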

Observability

API metrics:

request totals and latencies
auth/key-rotation failure counts
HTTP status totals
flash pipeline counters/gauges

Flash metrics include:

mode (async vs sync)
worker count
queue depth/capacity
enqueued/dropped/runs/failures
maintenance cycles

Optimization Notes

Ingest optimization:
- Use compact writes and async maintenance where possible.
- Keep expensive lifecycle/compaction/inference work off the critical path in async mode.
Retrieve optimization:
- Optimize candidate pool quality before heavy ranking.
- Preserve top-k precision with diversity constraints and intent caps.
Storage optimization:
- Use PostgreSQL in production.
- Keep compaction and inferred-memory lifecycle active to bound growth.

Known Risks / Follow-ups

Truncation risk:
- Compaction can remove useful tail context if not preserved as structured facts/provenance.
- Track with the TODO item for truncation-safe ingest compaction.
Async queue pressure:
- In async mode, a full queue can drop maintenance tasks (orbit_flash_pipeline_dropped_total).
- Monitor queue depth and drops.
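
The drop behavior can be illustrated with a bounded queue from the Python standard library. This is a sketch under assumptions, not the engine's worker code: `FlashQueue` and its methods are hypothetical, and the counter semantics (runs counts every executed task, failures counts those that raised) are a guess at what the exposed metrics mean.

```python
import queue
import threading


class FlashQueue:
    """Bounded maintenance queue that drops tasks when full."""

    def __init__(self, queue_size: int, workers: int) -> None:
        self._tasks = queue.Queue(maxsize=queue_size)
        self._workers = workers
        self._lock = threading.Lock()
        self.counters = {"enqueued": 0, "dropped": 0, "runs": 0, "failures": 0}

    def _bump(self, name: str) -> None:
        with self._lock:
            self.counters[name] += 1

    def submit(self, task) -> bool:
        # Never block the ingest caller: a full queue means the task is dropped.
        try:
            self._tasks.put_nowait(task)
        except queue.Full:
            self._bump("dropped")
            return False
        self._bump("enqueued")
        return True

    def depth(self) -> int:
        return self._tasks.qsize()

    def start(self) -> None:
        for _ in range(self._workers):
            threading.Thread(target=self._work, daemon=True).start()

    def _work(self) -> None:
        while True:
            task = self._tasks.get()
            self._bump("runs")
            try:
                task()
            except Exception:
                # A failing task must not kill the worker thread.
                self._bump("failures")
            finally:
                self._tasks.task_done()

    def wait_idle(self) -> None:
        # Block until every enqueued task has been processed.
        self._tasks.join()
```

A steadily rising dropped count alongside a queue depth near capacity suggests that the worker count or queue size is too small for the ingest rate.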

Change Management Rules

When modifying engine behavior:

Update this manual in the same PR.
Add/adjust tests for changed paths.
Expose new operational state in metrics if it affects runtime reliability.