Orbit Engine Manual
This is the single internal system manual for the Orbit engine. When architecture or runtime behavior changes, update this file in the same PR.
Purpose
Orbit’s engine converts raw application events into compact, high-signal memory that can be retrieved with low latency and high personalization quality.
Core contract:
- ingest: accept signals and persist/update memory
- retrieve: return ranked, relevant context
- feedback: learn from outcomes
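The three-method contract can be sketched as a minimal interface. This is a sketch only; the type names and signatures below are assumptions, not the engine's actual API.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Memory:
    # Hypothetical memory record; field names are illustrative only.
    id: str
    text: str
    score: float = 0.0
    provenance: dict = field(default_factory=dict)


class EngineContract(Protocol):
    def ingest(self, event: dict) -> None:
        """Accept a raw signal and persist/update memory."""

    def retrieve(self, query: str, k: int = 10) -> list["Memory"]:
        """Return ranked, relevant context."""

    def feedback(self, memory_id: str, outcome: float) -> None:
        """Learn from outcomes (e.g. whether a retrieved memory was useful)."""
```

Treating the contract as a Protocol keeps callers decoupled from whichever concrete engine implementation is wired in.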
Runtime Architecture
High-level components
- InputProcessor (stage1_input): cleans input and creates semantic/raw embeddings.
- DecisionLogic (stage2_decision): scores the event, assigns decay policy, and decides tier/store behavior.
- Storage (SQLiteStorageManager or SQLAlchemyStorageManager): durable memory persistence and indexed candidate search.
- RetrievalRanker plus the retrieval service: candidate ranking, intent-aware reweighting, diversity controls.
- AdaptivePersonalizationEngine: inferred memory generation from repeated patterns/feedback.
- LearningLoop: feedback-driven updates to ranking/importance/decay learners.
Ingest Path
Ingest is write-oriented and should stay lightweight from the caller's perspective.
Flow:
- Event -> processed representation.
- Decision -> store/discard and tier choice.
- Core memory write.
- Flash pipeline maintenance (sync or async mode).
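The four ingest steps above can be sketched end to end. The function and stub names here are hypothetical stand-ins for the real stage classes (InputProcessor, DecisionLogic, storage, and the flash pipeline).

```python
def ingest(event: dict, processor, decision, storage, flash) -> bool:
    """Sketch of the ingest path; returns True if the event was stored."""
    # 1. Event -> processed representation (cleaning + embeddings).
    processed = processor.process(event)

    # 2. Decision -> store/discard and tier choice.
    verdict = decision.score(processed)
    if not verdict["store"]:
        return False

    # 3. Core memory write.
    storage.write(processed, tier=verdict["tier"])

    # 4. Flash pipeline maintenance (inline in sync mode, queued in async mode).
    flash.schedule_maintenance(processed)
    return True
```

The key property to preserve is that steps 1-3 stay cheap for the caller, with step 4 deferrable to background workers in async mode.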
Flash Pipeline (Ingest-side maintenance)
Goal: keep database memory clean/compact without blocking ingest-critical work.
Tasks:
- conflict/lifecycle checks
- inferred pattern generation
- cluster compaction
- periodic maintenance hooks (decay/recalibration cadence)
Modes
- sync (default): maintenance executes inline.
- async: maintenance runs in background queue workers.
Config:
- MDE_FLASH_PIPELINE_MODE=sync|async
- MDE_FLASH_PIPELINE_WORKERS
- MDE_FLASH_PIPELINE_QUEUE_SIZE
- MDE_FLASH_PIPELINE_MAINTENANCE_INTERVAL
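For example, an async deployment might set the variables like this (the values are illustrative, not recommended defaults):

```shell
# Run flash maintenance in background workers instead of inline.
export MDE_FLASH_PIPELINE_MODE=async
export MDE_FLASH_PIPELINE_WORKERS=2
export MDE_FLASH_PIPELINE_QUEUE_SIZE=1000
export MDE_FLASH_PIPELINE_MAINTENANCE_INTERVAL=300
```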
Operational counters are exposed via /v1/metrics.
Retrieve Path
Retrieve is latency-sensitive and the primary runtime bottleneck.
Current strategy:
- Semantic candidate preselection.
- Learned ranking.
- Query-aware reweighting and diversity handling.
- Intent caps and inferred-memory probe coverage.
- Ranked memory response with provenance metadata.
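A minimal sketch of that strategy follows. The scoring weights, field names, and helper logic are assumptions for illustration; the real ranking lives in RetrievalRanker and the retrieval service.

```python
def retrieve(candidates: list[dict], k: int = 5, intent_cap: int = 2) -> list[dict]:
    """Sketch of the retrieve path over candidates with precomputed similarity."""
    # 1. Semantic candidate preselection: keep a pool of the closest candidates.
    pool = sorted(candidates, key=lambda c: c["similarity"], reverse=True)[: k * 4]

    # 2. Learned-ranking stand-in: blend similarity with stored importance.
    for c in pool:
        c["rank_score"] = 0.7 * c["similarity"] + 0.3 * c["importance"]
    pool.sort(key=lambda c: c["rank_score"], reverse=True)

    # 3. Diversity handling via intent caps: bound results per intent label.
    picked, per_intent = [], {}
    for c in pool:
        n = per_intent.get(c["intent"], 0)
        if n >= intent_cap:
            continue
        per_intent[c["intent"]] = n + 1
        picked.append(c)
        if len(picked) == k:
            break

    # 4. Ranked response with provenance metadata attached.
    return [
        {"id": c["id"], "score": c["rank_score"], "provenance": c.get("provenance", {})}
        for c in picked
    ]
```

The intent cap is what keeps one dominant intent (e.g. a heavily repeated topic) from crowding out the rest of the top-k.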
Data Quality Principles
- Favor compact, reusable facts over long prompt blobs.
- Preserve provenance (why/when/type/derived_from) for debuggability.
- Detect and track contested facts instead of silently overwriting.
- Keep noisy assistant-heavy memories from crowding concise user profile signals.
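A compact memory record following these principles might look like the dict below. The why/when/type/derived_from keys come from the provenance principle above; every other field name and value is illustrative.

```python
# A compact, reusable fact with full provenance, rather than a long prompt blob.
memory = {
    "text": "User prefers metric units",
    "provenance": {
        "why": "stated explicitly by the user",
        "when": "2025-01-12T09:30:00Z",
        "type": "user_stated",
        "derived_from": ["event:evt_123"],  # hypothetical source event id
    },
    # Contested facts are tracked, not silently overwritten.
    "contested_by": [],
}
```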
Observability
API metrics:
- request totals and latencies
- auth/key-rotation failure counts
- HTTP status totals
- flash pipeline counters/gauges
Flash metrics include:
- mode (async vs sync)
- worker count
- queue depth/capacity
- enqueued/dropped/runs/failures
- maintenance cycles
Optimization Notes
- Ingest optimization:
  - Use compact writes and async maintenance where possible.
  - Keep expensive lifecycle/compaction/inference off the critical path in async mode.
- Retrieve optimization:
  - Optimize candidate pool quality before heavy ranking.
  - Preserve top-k precision with diversity constraints and intent caps.
- Storage optimization:
  - Use PostgreSQL in production.
  - Keep compaction and inferred-memory lifecycle active to bound growth.
Known Risks / Follow-ups
- Truncation risk:
  - Compaction can remove useful tail context if it is not preserved as structured facts/provenance.
  - Track with the TODO item for truncation-safe ingest compaction.
- Async queue pressure:
  - In async mode, a full queue can drop maintenance tasks (orbit_flash_pipeline_dropped_total).
  - Monitor queue depth and drops.
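The drop behavior can be illustrated with a bounded queue. This is a sketch, not the engine's implementation; only the counter name orbit_flash_pipeline_dropped_total comes from the metrics above.

```python
import queue


class FlashQueue:
    """Bounded maintenance queue: when full, tasks are dropped and counted."""

    def __init__(self, capacity: int):
        self._q = queue.Queue(maxsize=capacity)
        self.enqueued_total = 0
        self.dropped_total = 0  # mirrors orbit_flash_pipeline_dropped_total

    def enqueue(self, task) -> bool:
        try:
            self._q.put_nowait(task)
        except queue.Full:
            self.dropped_total += 1  # task is lost; alert on this counter
            return False
        self.enqueued_total += 1
        return True
```

Because dropped tasks are lost rather than retried, the dropped counter should be alerted on, not just graphed.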
Change Management Rules
When modifying engine behavior:
- Update this manual in the same PR.
- Add/adjust tests for changed paths.
- Expose new operational state in metrics if it affects runtime reliability.