Database Intent And Revision Contract
Purpose
This document defines the canonical database contract across:
- ValueEngine
- DesignDocs
- AppGenerator
- refinement control-plane flows
- generated app artifacts
- runtime migration application
The current system persists useful build artifacts, but the database layer is still only partially explicit. This document makes the intended contract clear.
Core Decision
Mozaiks should treat database development the same way it treats UI and module generation:
- design intent is generated first
- intent is persisted as a typed artifact
- staged app output includes the current canonical database artifact
- refinement compares old intent to new intent
- migration application is explicit and safety-gated
The source of truth is not a sampled live collection and not a prompt-only description.
The source of truth is a persisted database intent artifact.
In the canonical orchestration model, database intent revision is routed by the builder session loop and executed by scoped refinement workers or targeted workflow re-entry. It is not owned by ordinary workflow-local AG2 handoffs.
Current Implementation Boundary
This contract is implemented today as generator guidance, staged app artifacts, runtime ctx.persistence injection, database intent loading, index application, and additive migration application.
Current truth:
- database_intent_bundle is the canonical generated database planning object.
- AppGenerator writes that object to config/database_intent.json when it is present.
- additive refinement plans may be staged under config/database_migrations/{migration_id}.json.
- generated modules use backend/schemas.py for typed document/request shapes; backend/models.py and backend/models/*.py are not canonical outputs.
- backend/repo.py owns persistence operations and should be derived from database intent where possible.
- the OSS runtime injects ctx.persistence into module actions when an app_id is available; generated repo code uses ctx.persistence.collection(module_id, entity_name).
- the OSS runtime loads promoted config/database_intent.json as app metadata. It applies declared collection indexes idempotently at platform startup. It loads config/database_migrations/*.json and applies only supported additive operations with migration history. Destructive migrations are not supported.
- the OSS runtime does not inject ctx.db into module actions.
- generated repo code must not require or emit ctx.db, import get_mongo_client(), or hardcode database names.
- historical project database managers are reference material only; do not port or copy them into generated apps.
What This Contract Covers
This contract covers:
- initial app-build database design
- workflow-stage handoff of database intent
- staged app-bundle persistence of database intent
- refinement-time schema diffing
- migration-plan persistence
- runtime-safe application of additive changes
This contract does not assume:
- SQL databases
- ORM-managed schema
- destructive auto-migrations
- preservation of pre-production drift
Canonical Ownership
Use these ownership rules.
| Concern | Owner |
|---|---|
| Runtime/session collections | mozaiksai runtime |
| Builder artifact collections | factory_app workflows persisted through mozaiksai |
| App business collections | generated module/control-plane surfaces |
| Migration planning | AppGenerator + refinement control plane |
| Migration application | platform/backend runtime |
Canonical Persistence Namespaces
Collapse framework-owned metadata into one canonical Mongo namespace:
mozaiksai
That namespace should own:
- runtime collections
- builder artifact collections
- refinement/migration metadata
The current mixed names:
- autogen_ai_agents
- MozaiksAI
- mozaiks
should be treated as drift to remove over time.
Build Sequence Contract
Phase 1: ValueEngine
ValueEngine owns concept intent and coarse planning hints.
It should persist:
- ValueManifest
- BuildPlan
It should not finalize database structure.
It may emit:
- domain/entity hints
- capability-pack hints
- surface candidate hints
But final collection ownership belongs downstream.
Phase 2: DesignDocs
DesignDocs is the first workflow that should produce a canonical database artifact.
It should emit two database outputs:
- database_markdown: human-readable rationale and explanation
- database_intent_bundle: typed machine-readable contract
database_intent_bundle is the real handoff object.
Phase 3: AppGenerator
AppGenerator consumes database_intent_bundle and compiles it into staged app artifacts.
The canonical staged artifact path should be:
config/database_intent.json
If the run is a refinement and a migration is needed, AppGenerator should also stage:
config/database_migrations/{migration_id}.json
This replaces the older idea of writing migrations under backend/database/migrations/, which assumes a backend topology that is not the canonical app-bundle contract.
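The staging step in this phase could look roughly like the sketch below. It assumes the two paths this document names elsewhere (config/database_intent.json and config/database_migrations/{migration_id}.json). The function name `stage_database_artifacts` and its signature are hypothetical, not the AppGenerator's actual API.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def stage_database_artifacts(app_root: Path, intent_bundle: dict,
                             migration: Optional[dict] = None) -> list:
    """Hypothetical helper: write the intent (and an optional migration plan)
    under <app_root>/config and return the paths that were written."""
    config_dir = app_root / "config"
    config_dir.mkdir(parents=True, exist_ok=True)

    intent_path = config_dir / "database_intent.json"
    intent_path.write_text(json.dumps(intent_bundle, indent=2, sort_keys=True))
    written = [intent_path]

    # Only refinement runs that need a migration stage a migration document.
    if migration is not None:
        migrations_dir = config_dir / "database_migrations"
        migrations_dir.mkdir(exist_ok=True)
        migration_path = migrations_dir / f"{migration['migration_id']}.json"
        migration_path.write_text(json.dumps(migration, indent=2, sort_keys=True))
        written.append(migration_path)
    return written


# Usage: stage into a throwaway directory and list the relative paths.
with tempfile.TemporaryDirectory() as tmp:
    paths = stage_database_artifacts(
        Path(tmp),
        {"version": "1", "surfaces": []},
        {"migration_id": "001_projects", "version": "1", "operations": []},
    )
    staged = [p.relative_to(tmp).as_posix() for p in paths]
```

Writing with sorted keys keeps the staged files stable across runs, which makes later intent-to-intent comparison less noisy; that choice is a suggestion, not something the contract requires.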
Phase 4: Promotion
Promotion copies the approved database artifacts along with the rest of the app bundle.
The promoted app root should contain:
- config/database_intent.json
- optional config/database_migrations/*.json
Canonical Database Intent Artifact
The canonical artifact is database_intent_bundle.
It should be stored in persistence and also written to the staged app bundle as config/database_intent.json.
Hosted product/platform workspaces must keep product-owned collection metadata outside app/config/database_intent.json. That path is reserved for generated app persistence intent consumed by the OSS runtime. Do not place hosted collection aliases, proprietary hosted collection names, or host-system authority records in generated-app database intent.
Minimum shape:
{
  "version": "1",
  "app_id": "app_123",
  "artifact_version_id": "art_456",
  "surfaces": [
    {
      "surface_id": "projects",
      "surface_kind": "module",
      "collections": [
        {
          "name": "projects",
          "scope": "app",
          "ownership": {
            "surface_id": "projects",
            "surface_kind": "module"
          },
          "fields": [
            {"name": "project_id", "type": "string", "required": true},
            {"name": "app_id", "type": "string", "required": true},
            {"name": "status", "type": "string", "required": true}
          ],
          "indexes": [
            {"keys": [["app_id", 1], ["project_id", 1]], "unique": true}
          ],
          "search_by": "project_id",
          "lifecycle": {
            "write_mode": "module_action",
            "migration_policy": "additive_only"
          }
        }
      ]
    }
  ],
  "shared_collections": [],
  "policies": {
    "default_scope_field": "app_id",
    "allow_destructive_migrations": false
  }
}
Required Fields In database_intent_bundle
At minimum, each collection intent must declare:
- name
- scope
- ownership.surface_id
- ownership.surface_kind
- fields
- indexes
- search_by when updates are supported
- lifecycle.write_mode
- lifecycle.migration_policy
Field entries should include:
- name
- type
- required
- optional default
- optional enum
- optional nullable
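A minimal sketch of checking one collection intent against the required keys listed above. The helper name `collection_intent_errors` and the error strings are hypothetical. The conditional search_by requirement is left out because this document does not say how "updates are supported" is detected.

```python
REQUIRED_COLLECTION_KEYS = ("name", "scope", "fields", "indexes")
REQUIRED_OWNERSHIP_KEYS = ("surface_id", "surface_kind")
REQUIRED_LIFECYCLE_KEYS = ("write_mode", "migration_policy")
REQUIRED_FIELD_KEYS = ("name", "type", "required")


def collection_intent_errors(collection: dict) -> list:
    """Return human-readable problems; an empty list means the minimum is met."""
    errors = []
    for key in REQUIRED_COLLECTION_KEYS:
        if key not in collection:
            errors.append(f"missing {key}")

    ownership = collection.get("ownership") or {}
    for key in REQUIRED_OWNERSHIP_KEYS:
        if key not in ownership:
            errors.append(f"missing ownership.{key}")

    lifecycle = collection.get("lifecycle") or {}
    for key in REQUIRED_LIFECYCLE_KEYS:
        if key not in lifecycle:
            errors.append(f"missing lifecycle.{key}")

    # Each field entry needs name, type and required; default, enum and
    # nullable are optional and therefore not checked here.
    for position, field in enumerate(collection.get("fields", [])):
        for key in REQUIRED_FIELD_KEYS:
            if key not in field:
                errors.append(f"fields[{position}] missing {key}")
    return errors


complete = {
    "name": "projects",
    "scope": "app",
    "ownership": {"surface_id": "projects", "surface_kind": "module"},
    "fields": [{"name": "project_id", "type": "string", "required": True}],
    "indexes": [],
    "lifecycle": {"write_mode": "module_action", "migration_policy": "additive_only"},
}
incomplete = {"name": "projects", "fields": [{"name": "status"}], "indexes": []}

complete_errors = collection_intent_errors(complete)
incomplete_errors = collection_intent_errors(incomplete)
```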
Module-Level Collection Ownership
Module ownership does not need a separate top-level canonical database.yaml file yet.
Instead, module-level collections should be declared inside database_intent_bundle.surfaces[*].collections[*] with:
- surface_kind=module
- surface_id=<module_id>
That keeps one canonical database source of truth while still expressing module ownership clearly.
Generated module files such as:
- backend/repo.py
- backend/policy.py
- backend/schemas.py
should be derived from this artifact, not act as the schema source of truth.
Persistence Collections For Database Contracts
Add canonical builder metadata collections under mozaiksai:
- DatabaseIntents
- DatabaseMigrations
DatabaseIntents
Stores the latest and historical typed database intent artifacts.
Suggested keys:
- app_id
- artifact_version_id
- build_id
- change_class
- database_intent_bundle
- created_at
- updated_at
DatabaseMigrations
Stores generated migration plans and application status.
Suggested keys:
- migration_id
- app_id
- base_artifact_version_id
- target_artifact_version_id
- change_class
- diff_summary
- migration_document
- status
- applied_at
- warnings
Revision And Refinement Contract
Every refinement that can affect business data must compare:
- previous database_intent.json
- new database_intent.json
The diff output is the basis for the migration plan.
The current helper in factory_app/workflows/AppGenerator/tools/schema_migration.py is the right starting point, but it should be treated as part of this contract rather than a standalone helper.
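To make the comparison concrete, here is a small sketch of an intent-to-intent diff. It is not the behavior of the schema_migration.py helper, which this document does not show. The function name `diff_database_intents` and the shape of the returned diff are assumptions made for illustration.

```python
def _collections_by_key(bundle: dict) -> dict:
    """Index collections by (surface_id, collection name)."""
    found = {}
    for surface in bundle.get("surfaces", []):
        for collection in surface.get("collections", []):
            found[(surface["surface_id"], collection["name"])] = collection
    return found


def _fields_by_name(collection: dict) -> dict:
    return {field["name"]: field for field in collection.get("fields", [])}


def diff_database_intents(previous: dict, new: dict) -> dict:
    """Compare two intent bundles at collection and field granularity."""
    old_cols = _collections_by_key(previous)
    new_cols = _collections_by_key(new)
    diff = {
        "added_collections": sorted(new_cols.keys() - old_cols.keys()),
        "removed_collections": sorted(old_cols.keys() - new_cols.keys()),
        "added_fields": [],
        "removed_fields": [],
        "changed_fields": [],
    }
    # Field-level comparison only makes sense for collections in both bundles.
    for key in sorted(old_cols.keys() & new_cols.keys()):
        old_fields = _fields_by_name(old_cols[key])
        new_fields = _fields_by_name(new_cols[key])
        for name in sorted(new_fields.keys() - old_fields.keys()):
            diff["added_fields"].append(key + (name,))
        for name in sorted(old_fields.keys() - new_fields.keys()):
            diff["removed_fields"].append(key + (name,))
        for name in sorted(old_fields.keys() & new_fields.keys()):
            if old_fields[name] != new_fields[name]:
                diff["changed_fields"].append(key + (name,))
    return diff


previous_intent = {"surfaces": [{"surface_id": "projects", "collections": [
    {"name": "projects", "fields": [
        {"name": "project_id", "type": "string", "required": True},
        {"name": "status", "type": "string", "required": True},
    ]},
]}]}
new_intent = {"surfaces": [{"surface_id": "projects", "collections": [
    {"name": "projects", "fields": [
        {"name": "project_id", "type": "string", "required": True},
        {"name": "status", "type": "string", "required": True},
        {"name": "owner_id", "type": "string", "required": False},
    ]},
    {"name": "tasks", "fields": []},
]}]}

intent_diff = diff_database_intents(previous_intent, new_intent)
```

A diff in this shape is additive when the removed and changed lists are empty, which is the condition the change-class rules rely on.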
Change-Class Rules
patch
Default rule:
- database intent should not change
If DB changes appear in a patch refinement:
- route must escalate scope
- do not auto-apply
design
Default rule:
- database intent is frozen
Visual or layout refinements should not mutate collection intent.
feature
Default rule:
- additive changes only
Allowed:
- new collection
- new optional field
- new field with safe default
- new non-destructive index
Blocked by default:
- field removal
- collection removal
- type narrowing
- unique constraint that would invalidate existing data
core
Default rule:
- create a new upstream concept revision
- mark downstream database intents stale
core is not an in-place destructive migration flow.
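The four change classes can be read as a small routing table. The sketch below is one possible encoding. The function name and the returned action labels are hypothetical, and the label used for design (rejecting a mutation of frozen intent) is an interpretation of "frozen" rather than something this document specifies.

```python
CHANGE_CLASSES = ("patch", "design", "feature", "core")


def database_change_decision(change_class: str, intent_changed: bool,
                             additive_only: bool = True) -> str:
    """Map a refinement's change class to a database routing action."""
    if change_class not in CHANGE_CLASSES:
        raise ValueError(f"unknown change class: {change_class!r}")
    if change_class == "core":
        # core creates a new upstream concept revision and marks downstream
        # database intents stale; it is never an in-place migration.
        return "new_concept_revision"
    if not intent_changed:
        return "proceed"
    if change_class == "patch":
        # DB changes in a patch must escalate scope and must not auto-apply.
        return "escalate_scope"
    if change_class == "design":
        # Assumed handling: frozen intent means the change is rejected.
        return "reject_frozen_intent"
    # feature: additive changes only; anything else is blocked by default.
    return "plan_additive_migration" if additive_only else "blocked_by_default"


decisions = {
    "patch_clean": database_change_decision("patch", intent_changed=False),
    "patch_db": database_change_decision("patch", intent_changed=True),
    "design_db": database_change_decision("design", intent_changed=True),
    "feature_add": database_change_decision("feature", True, additive_only=True),
    "feature_drop": database_change_decision("feature", True, additive_only=False),
    "core": database_change_decision("core", intent_changed=True),
}
```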
Safe Migration Categories
Safe to auto-apply:
- create collection
- add nullable field
- add field with deterministic backfill/default
- add non-conflicting index
Needs explicit review:
- rename field
- make optional field required
- add unique index on existing dirty data
- change field type
Blocked by default:
- drop collection
- drop field
- destructive data rewrite
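As an illustration, the three categories above can be expressed as lookup sets with a "worst category wins" rule for a whole plan. The change-kind identifiers below are hypothetical names for the bullets in this section. Treating an unrecognized kind as blocked is a conservative assumption, not a documented rule.

```python
SAFE_TO_AUTO_APPLY = frozenset({
    "create_collection", "add_nullable_field",
    "add_field_with_default", "add_non_conflicting_index",
})
NEEDS_REVIEW = frozenset({
    "rename_field", "make_field_required",
    "add_unique_index_on_dirty_data", "change_field_type",
})
BLOCKED_BY_DEFAULT = frozenset({
    "drop_collection", "drop_field", "destructive_data_rewrite",
})

_SEVERITY = {"auto_apply": 0, "needs_review": 1, "blocked": 2}


def classify_change(kind: str) -> str:
    if kind in SAFE_TO_AUTO_APPLY:
        return "auto_apply"
    if kind in NEEDS_REVIEW:
        return "needs_review"
    # Covers BLOCKED_BY_DEFAULT and, by assumption, any unknown kind.
    return "blocked"


def plan_disposition(kinds) -> str:
    """A plan is only as safe as its least safe change."""
    return max((classify_change(kind) for kind in kinds),
               key=_SEVERITY.__getitem__, default="auto_apply")


additive_plan = plan_disposition(["create_collection", "add_nullable_field"])
review_plan = plan_disposition(["create_collection", "rename_field"])
blocked_plan = plan_disposition(["rename_field", "drop_field"])
```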
Runtime Application Contract
The runtime/platform layer should apply migrations only from the staged/promoted database migration artifact.
It should:
- load config/database_intent.json
- ensure declared indexes exist
- load any pending config/database_migrations/*.json
- record applied migration ids
- reject blocked/destructive operations unless explicitly approved by policy
Current implementation status: runtime loads config/database_intent.json, ensures declared indexes exist, loads config/database_migrations/*.json, and records migration state in mozaiksai.AppDatabaseMigrations. Supported migration operations are limited to ensure_collection and ensure_index. Runtime does not mutate existing documents, apply destructive changes, execute arbitrary migration code, or support operator-approved destructive migrations yet.
Database startup policy is controlled by MOZAIKS_DATABASE_STARTUP_POLICY:
- best_effort is the default for backward compatibility. Index or migration failures are logged and platform startup continues.
- required is recommended for production persistent generated apps. Index or migration failures fail startup with app id, app root, and original error context.
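The difference between the two policies can be sketched as follows. Only the environment variable name and its two values come from this document. The function names, the log wording, and the choice to reject unrecognized values are assumptions.

```python
import logging
import os

logger = logging.getLogger("database_startup")

VALID_POLICIES = ("best_effort", "required")


def resolve_startup_policy(env=None) -> str:
    """Read MOZAIKS_DATABASE_STARTUP_POLICY, defaulting to best_effort."""
    env = os.environ if env is None else env
    value = env.get("MOZAIKS_DATABASE_STARTUP_POLICY", "best_effort").strip().lower()
    if value not in VALID_POLICIES:
        # Assumption: an unrecognized value is a configuration error.
        raise ValueError(f"unsupported startup policy: {value!r}")
    return value


def run_database_startup(step, policy: str, app_id: str, app_root: str) -> bool:
    """Run an index/migration step; return True on success.

    best_effort logs the failure and lets startup continue; required
    re-raises with app id, app root and the original error attached.
    """
    try:
        step()
    except Exception as exc:
        if policy == "required":
            raise RuntimeError(
                f"database startup failed for app_id={app_id} "
                f"app_root={app_root}: {exc}"
            ) from exc
        logger.warning("database startup failed for app_id=%s app_root=%s: %s",
                       app_id, app_root, exc)
        return False
    return True


def _failing_step():
    raise OSError("index build failed")


best_effort_result = run_database_startup(
    _failing_step, "best_effort", "app_123", "/apps/app_123")
try:
    run_database_startup(_failing_step, "required", "app_123", "/apps/app_123")
    required_error = None
except RuntimeError as exc:
    required_error = str(exc)
```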
App business data is stored in the generated-app database selected by:
- an injected database name when the runtime adapter is constructed
- MOZAIKS_APP_DATABASE_NAME
- MOZAIKS_APPS_DATABASE
- fallback mozaiks_apps
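A minimal sketch of that selection, assuming the sources are consulted in the order they are listed and that empty values are skipped. The function name `resolve_app_database_name` is hypothetical.

```python
import os


def resolve_app_database_name(injected=None, env=None) -> str:
    """Pick the generated-app database name from the listed sources."""
    env = os.environ if env is None else env
    candidates = (
        injected,                              # passed when the adapter is built
        env.get("MOZAIKS_APP_DATABASE_NAME"),
        env.get("MOZAIKS_APPS_DATABASE"),
    )
    for candidate in candidates:
        if candidate:                          # skip None and empty strings
            return candidate
    return "mozaiks_apps"                      # documented fallback


name_injected = resolve_app_database_name(
    "tenant_db", {"MOZAIKS_APP_DATABASE_NAME": "from_env"})
name_primary_env = resolve_app_database_name(
    None, {"MOZAIKS_APP_DATABASE_NAME": "from_env", "MOZAIKS_APPS_DATABASE": "legacy"})
name_secondary_env = resolve_app_database_name(None, {"MOZAIKS_APPS_DATABASE": "legacy"})
name_fallback = resolve_app_database_name(None, {})
```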
Migration history and locking:
- mozaiksai.AppDatabaseMigrations doubles as the migration lock collection.
- the runtime atomically claims a migration by inserting an in_progress record for (app_id, migration_id) before operations begin.
- the history collection has a unique (app_id, migration_id) index, so two platform/runtime instances cannot both claim the same app migration.
- in_progress: written before migration operations begin, with claimed_at and lock_owner.
- applied: written after all operations succeed.
- failed: written when an operation fails, including error_type, error_message, failed_operation_index, and failed_operation_summary.
Retry policy is conservative: an existing applied record with the same hash is skipped; an existing applied record with a different hash errors; existing in_progress or failed records error until an operator clears or repairs the history record. There is no automatic lock takeover in the first pass. in_progress means another instance is applying the migration or a prior instance crashed after claim. This avoids silently reapplying ambiguous migration state.
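The claim and retry rules can be modeled without a database. In the sketch below a dictionary keyed by (app_id, migration_id) stands in for the unique index on the history collection. The class and method names are hypothetical, and a real implementation would depend on the database rejecting a duplicate insert rather than on an in-process lookup.

```python
from datetime import datetime, timezone


class MigrationStateError(RuntimeError):
    """Raised when history is ambiguous and needs operator attention."""


class InMemoryMigrationHistory:
    def __init__(self):
        self._records = {}  # (app_id, migration_id) -> record

    def claim(self, app_id, migration_id, migration_hash, lock_owner) -> str:
        key = (app_id, migration_id)
        existing = self._records.get(key)
        if existing is None:
            # First claimant wins and writes the in_progress record.
            self._records[key] = {
                "status": "in_progress",
                "migration_hash": migration_hash,
                "lock_owner": lock_owner,
                "claimed_at": datetime.now(timezone.utc),
            }
            return "claimed"
        if existing["status"] == "applied":
            if existing["migration_hash"] == migration_hash:
                return "skipped"
            raise MigrationStateError("applied record has a different hash")
        # in_progress or failed: no automatic takeover or retry.
        raise MigrationStateError(
            f"{existing['status']} record requires operator repair")

    def finish(self, app_id, migration_id, error=None) -> None:
        record = self._records[(app_id, migration_id)]
        if error is None:
            record["status"] = "applied"
        else:
            record.update(status="failed",
                          error_type=type(error).__name__,
                          error_message=str(error))


history = InMemoryMigrationHistory()
first_claim = history.claim("app_123", "001_projects", "hash_a", "instance_1")
try:
    history.claim("app_123", "001_projects", "hash_a", "instance_2")
    second_claim_blocked = False
except MigrationStateError:
    second_claim_blocked = True

history.finish("app_123", "001_projects")
repeat_claim = history.claim("app_123", "001_projects", "hash_a", "instance_2")
try:
    history.claim("app_123", "001_projects", "hash_b", "instance_2")
    changed_hash_blocked = False
except MigrationStateError:
    changed_hash_blocked = True
```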
Operator health inspection is read-only. The runtime helper get_migration_health_report() returns:
{
  "summary": {"total": 12, "applied": 10, "in_progress": 1, "failed": 1, "unknown": 0},
  "items": [
    {
      "app_id": "app_123",
      "migration_id": "001_projects",
      "status": "failed",
      "migration_hash": "...",
      "failed_at": "...",
      "error_message": "...",
      "failed_operation_index": 1,
      "is_blocker": true,
      "unknown_status": false
    }
  ],
  "has_blockers": true,
  "has_unknown_statuses": false
}
The helper may filter by app_id and status, and it enforces a result limit. It does not mutate history, clear failed records, repair stuck in_progress records, retry migrations, or take over locks. Repair/clear workflows remain future operator tooling.
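For readers who want to see how a report of this shape could be assembled, here is a read-only sketch over plain history records. It is not the implementation of get_migration_health_report(). The function name `build_health_report`, the default limit, and the choice to count only the returned items in the summary are all assumptions.

```python
KNOWN_STATUSES = ("applied", "in_progress", "failed")
BLOCKING_STATUSES = ("in_progress", "failed")


def build_health_report(records, app_id=None, status=None, limit=100) -> dict:
    """Summarize migration history without mutating the input records."""
    summary = {"total": 0, "applied": 0, "in_progress": 0, "failed": 0, "unknown": 0}
    items = []
    for record in records:
        if app_id is not None and record.get("app_id") != app_id:
            continue
        if status is not None and record.get("status") != status:
            continue
        if len(items) >= limit:
            break
        record_status = record.get("status")
        unknown = record_status not in KNOWN_STATUSES
        summary["total"] += 1
        summary["unknown" if unknown else record_status] += 1
        # Copy the record so the stored history is left untouched.
        items.append({
            **record,
            "is_blocker": record_status in BLOCKING_STATUSES,
            "unknown_status": unknown,
        })
    return {
        "summary": summary,
        "items": items,
        "has_blockers": any(item["is_blocker"] for item in items),
        "has_unknown_statuses": summary["unknown"] > 0,
    }


sample_history = [
    {"app_id": "app_123", "migration_id": "001_projects", "status": "applied"},
    {"app_id": "app_123", "migration_id": "002_tasks", "status": "failed"},
    {"app_id": "app_456", "migration_id": "001_notes", "status": "archived"},
]
full_report = build_health_report(sample_history)
app_report = build_health_report(sample_history, app_id="app_123", status="applied")
```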
Operators can inspect the same report from the CLI:
The command returns 0 when there are no blockers or unknown statuses, 1 when failed/in-progress blockers or unknown statuses exist, and 2 for configuration or Mongo/report loading errors. It is read-only and does not print Mongo credentials.
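The exit-code rule can be written down directly from the report flags. This sketch assumes the report dictionary shape shown earlier in this section. The function names are hypothetical and do not describe the actual CLI entry point.

```python
def health_exit_code(report: dict) -> int:
    """0 when clean, 1 when blockers or unknown statuses exist."""
    if report.get("has_blockers") or report.get("has_unknown_statuses"):
        return 1
    return 0


def run_health_check(load_report) -> int:
    """Return 2 when the report cannot be loaded at all."""
    try:
        report = load_report()
    except Exception:
        # Configuration or report-loading errors; credentials are never echoed.
        return 2
    return health_exit_code(report)


def _unreachable_database():
    raise ConnectionError("cannot reach database")


clean_code = run_health_check(
    lambda: {"has_blockers": False, "has_unknown_statuses": False})
blocked_code = run_health_check(
    lambda: {"has_blockers": True, "has_unknown_statuses": False})
unknown_code = run_health_check(
    lambda: {"has_blockers": False, "has_unknown_statuses": True})
error_code = run_health_check(_unreachable_database)
```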
Generated App Persistence Runbook
Generated app persistence is now supported end to end for module-owned business data. The canonical generated artifacts are:
config/database_intent.json
config/database_migrations/{migration_id}.json
modules/{module_id}/backend/repo.py
modules/{module_id}/backend/policy.py
modules/{module_id}/backend/schemas.py
Generated apps must not use:
At runtime, ModuleContext exposes ctx.persistence when the module request has an app_id. ctx.db is not injected and is not canonical. Generated backend/repo.py is the only generated backend layer that should touch persistence, and it should use ctx.persistence.collection(module_id, entity_name) with values that match config/database_intent.json. Generated module code must not call get_mongo_client() or hardcode database names.
Layer responsibilities:
- handler.py dispatches action calls to service methods only.
- service.py owns orchestration, validation, and event emission after state is committed; it calls repo methods for data access.
- repo.py owns persistence access through ctx.persistence.
- policy.py builds scope and domain filters.
- schemas.py owns typed document shapes and pure normalization helpers.
Runtime app loading behavior:
- missing config/database_intent.json is allowed for non-persistent apps.
- valid config/database_intent.json is loaded and indexed by (module_id, entity_name).
- invalid JSON or invalid shape fails app load.
- declared indexes are applied idempotently.
- additive migration files are loaded from config/database_migrations/*.json.
- migration states are recorded in mozaiksai.AppDatabaseMigrations.
- supported migration operations are ensure_collection and ensure_index.
- destructive migrations and arbitrary migration code are not supported.
- production persistent apps should set MOZAIKS_DATABASE_STARTUP_POLICY=required.
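The first three loading rules can be sketched as below. The function name `load_database_intent` is hypothetical, the shape check is deliberately minimal, and falling back to surface_id and name when module_id or entity_name are absent is an assumption made so both example shapes in this document can be indexed.

```python
import json
import tempfile
from pathlib import Path


def load_database_intent(app_root):
    """Return {(module_id, entity_name): collection}, or None when the app
    ships no database intent (a non-persistent app)."""
    path = Path(app_root) / "config" / "database_intent.json"
    if not path.exists():
        return None
    try:
        bundle = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON in {path}") from exc
    if not isinstance(bundle, dict) or not isinstance(bundle.get("surfaces"), list):
        raise ValueError(f"invalid database intent shape in {path}")

    indexed = {}
    for surface in bundle["surfaces"]:
        for collection in surface.get("collections", []):
            # Assumed fallbacks for bundles that omit the explicit keys.
            module_id = collection.get("module_id", surface.get("surface_id"))
            entity_name = collection.get("entity_name", collection.get("name"))
            indexed[(module_id, entity_name)] = collection
    return indexed


with tempfile.TemporaryDirectory() as tmp:
    missing_result = load_database_intent(tmp)

    config_dir = Path(tmp) / "config"
    config_dir.mkdir()
    intent_file = config_dir / "database_intent.json"
    intent_file.write_text(json.dumps({"version": "1", "surfaces": [
        {"surface_id": "projects", "surface_kind": "module", "collections": [
            {"module_id": "projects", "name": "projects", "entity_name": "projects"},
        ]},
    ]}))
    loaded_keys = sorted(load_database_intent(tmp))

    intent_file.write_text("{not json")
    try:
        load_database_intent(tmp)
        invalid_rejected = False
    except ValueError:
        invalid_rejected = True
```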
Compact neutral example:
{
  "version": "1",
  "surfaces": [
    {
      "surface_id": "projects",
      "surface_kind": "module",
      "collections": [
        {
          "module_id": "projects",
          "name": "projects",
          "entity_name": "projects",
          "indexes": [
            {
              "name": "project_owner_created_at",
              "keys": [
                {"field": "owner_id", "order": 1},
                {"field": "created_at", "order": -1}
              ]
            }
          ]
        }
      ]
    }
  ]
}
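class ProjectsRepo:
    """Repository for the projects module; the only layer that touches persistence."""

    async def _collection(self, ctx):
        # ctx.persistence is injected by the runtime only when the request has an app_id.
        persistence = getattr(ctx, "persistence", None)
        if persistence is None:
            raise RuntimeError("Persistence is not available for this app context.")
        # Arguments are (module_id, entity_name) and must match config/database_intent.json.
        return persistence.collection("projects", "projects")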
{
  "migration_id": "001_projects_tasks_indexes",
  "version": "1",
  "operations": [
    {"type": "ensure_collection", "module_id": "projects", "entity_name": "projects"},
    {
      "type": "ensure_index",
      "module_id": "projects",
      "entity_name": "projects",
      "index": {
        "name": "project_owner_created_at",
        "keys": [{"field": "owner_id", "order": 1}, {"field": "created_at", "order": -1}]
      }
    }
  ]
}
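To show why ensure_collection and ensure_index are safe to repeat, the sketch below applies a migration document of the shape above to an in-memory store. The store layout and the function name `apply_migration` are hypothetical stand-ins for the runtime's persistence adapter. Validating every operation before applying any of them is a design choice of this sketch, not a documented runtime behavior.

```python
import copy

SUPPORTED_OPERATIONS = ("ensure_collection", "ensure_index")


def apply_migration(migration: dict, store: dict) -> dict:
    """Apply additive operations to store, which maps
    (module_id, entity_name) -> {index_name: [(field, order), ...]}."""
    operations = migration.get("operations", [])
    # Reject the whole document up front so an unsupported operation
    # cannot leave the store half-migrated.
    for position, operation in enumerate(operations):
        if operation.get("type") not in SUPPORTED_OPERATIONS:
            raise ValueError(
                f"operation {position} has unsupported type "
                f"{operation.get('type')!r}")

    for operation in operations:
        key = (operation["module_id"], operation["entity_name"])
        indexes = store.setdefault(key, {})  # both operations ensure the collection
        if operation["type"] == "ensure_index":
            index = operation["index"]
            indexes[index["name"]] = [
                (entry["field"], entry["order"]) for entry in index["keys"]]
    return store


example_migration = {
    "migration_id": "001_projects_tasks_indexes",
    "version": "1",
    "operations": [
        {"type": "ensure_collection", "module_id": "projects", "entity_name": "projects"},
        {"type": "ensure_index", "module_id": "projects", "entity_name": "projects",
         "index": {"name": "project_owner_created_at",
                   "keys": [{"field": "owner_id", "order": 1},
                            {"field": "created_at", "order": -1}]}},
    ],
}

app_store = {}
apply_migration(example_migration, app_store)
after_first_run = copy.deepcopy(app_store)
apply_migration(example_migration, app_store)  # second run changes nothing

try:
    apply_migration({"operations": [{"type": "drop_collection"}]}, app_store)
    destructive_rejected = False
except ValueError:
    destructive_rejected = True
```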
Current coverage includes runtime persistence tests, generated app persistence smokes, downstream persistent projects generation replay, and live AppPlanAgent fixture replay.
data_entity Contract Upgrade
The existing data_entity runtime path is directionally correct but separate from generated module repo persistence.
Today it accepts:
- schema
- indexes
- write_strategy
Current support can validate required fields, create declared indexes, and enforce basic types/enums in the workflow data-entity lane. That does not mean generated module repos should use ctx.db; generated module repos use ctx.persistence instead.
To fully match this contract, runtime/platform persistence still needs to:
- support safe deferred flush semantics
- record applied collection setup state
Context Loading Contract
Workflows should continue to read builder artifacts through context_variables.yaml data_reference sources.
Add canonical context variables such as:
- database_intent_bundle
- database_migration_plan
- database_migration_status
Do not make downstream workflows depend on ad hoc collection names that drift from the persisted source-of-truth artifact.
Current Drift To Remove
These are known inconsistencies in the current system:
- ValueEngine writes ValueManifests, while downstream contexts still read from Concepts.
- Builder metadata is split across autogen_ai_agents, MozaiksAI, and mozaiks.
- prompts and tests must not fall back to old ctx.db, backend/database/*, or backend/models/* artifacts.
- generated repo guidance must stay aligned with ctx.persistence as the runtime-supported persistence boundary.
Recommended Implementation Order
- Normalize concept persistence naming.
  - unify ValueManifests vs Concepts
- Introduce database_intent_bundle as a typed DesignDocs artifact.
- Persist it to mozaiksai.DatabaseIntents.
- Write config/database_intent.json during AppGenerator.
- Move migration output to config/database_migrations/.
- Persist migration docs to mozaiksai.DatabaseMigrations.
- Keep generated repo.py guidance aligned with ctx.persistence.
- Exercise a persistent generated app smoke test.
- Teach refinement routing to apply the change-class DB rules in this doc.
Relationship To Other Docs
- end-to-end-build-lifecycle.md
  - overall builder lifecycle
- generated-frontend-surface-contract.md
  - persistent frontend surface ownership and realization boundaries
- refinement-control-plane.md
  - refinement routing and artifact-version control plane
This document defines the missing database layer that those docs assume.