refactor(metadata-db): decouple physical tables from storage paths#1696

Open
shiyasmohd wants to merge 2 commits into main from shiyasmohd/decouple-phy-tables-in-metadata-db

@shiyasmohd
Contributor

Split physical_tables into two tables to enable data sharing across
datasets with the same spec:

  • physical_tables: meta table with dataset identity and active revision pointer
  • physical_table_revisions: storage paths and writer info

This lets multiple datasets share the same physical storage while each
(dataset_namespace, dataset_name, manifest_hash, table_name) tuple keeps
its own activation state.
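
A rough sketch of the split may help reviewers picture it. It uses Python's built-in sqlite3 as a stand-in for the real metadata DB. The table names, the four identity columns, and active_revision_id come from this description; the `id`, `path`, and `writer` columns, the index name, and the sample values are assumptions made for illustration, not the actual migration.

```python
import sqlite3

# Illustrative schema only: `path` and `writer` are hypothetical column names.
SCHEMA = """
CREATE TABLE physical_table_revisions (
    id     INTEGER PRIMARY KEY,
    path   TEXT NOT NULL UNIQUE,   -- hypothetical: storage location
    writer TEXT                    -- hypothetical: writer info
);
CREATE TABLE physical_tables (
    id                 INTEGER PRIMARY KEY,
    dataset_namespace  TEXT NOT NULL,
    dataset_name       TEXT NOT NULL,
    manifest_hash      TEXT NOT NULL,
    table_name         TEXT NOT NULL,
    -- NULL means this dataset table currently has no active revision
    active_revision_id INTEGER REFERENCES physical_table_revisions(id),
    UNIQUE (dataset_namespace, dataset_name, manifest_hash, table_name)
);
CREATE INDEX idx_physical_tables_active_revision
    ON physical_tables (active_revision_id);
"""


def build_demo():
    """Create the schema and link two datasets to one shared revision."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rev_id = conn.execute(
        "INSERT INTO physical_table_revisions (path, writer) VALUES (?, ?)",
        ("s3://bucket/blocks/rev-1", "job-1"),
    ).lastrowid
    for namespace, name in [("team_a", "eth"), ("team_b", "eth_copy")]:
        conn.execute(
            "INSERT INTO physical_tables (dataset_namespace, dataset_name,"
            " manifest_hash, table_name, active_revision_id)"
            " VALUES (?, ?, ?, ?, ?)",
            (namespace, name, "abc123", "blocks", rev_id),
        )
    return conn


def active_paths(conn):
    """Return (namespace, name, path) for every table with an active revision."""
    return conn.execute(
        "SELECT pt.dataset_namespace, pt.dataset_name, r.path"
        " FROM physical_tables pt"
        " JOIN physical_table_revisions r ON r.id = pt.active_revision_id"
        " ORDER BY pt.dataset_namespace"
    ).fetchall()
```

In this sketch both dataset rows point at the same revision row, so clearing active_revision_id on one of them leaves the other's storage path untouched.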

Changes:

  • Add migration to split tables and preserve FK compatibility
  • Update unique constraint to (dataset_namespace, dataset_name, manifest_hash, table_name)
  • Modify register/mark_active/mark_inactive APIs to include dataset identifiers
  • Update all queries to join tables appropriately
  • Add index on active_revision_id for query performance
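
To illustrate what "include dataset identifiers" could look like at the call site, here is a minimal sketch, again in Python with sqlite3 rather than the project's real database layer. Only the function names register, mark_active, and mark_inactive and the four key columns come from this description; the signatures, the `path` column, and the helper `active_path` are hypothetical.

```python
import sqlite3

# WHERE clause over the unique key named in this PR.
KEY_MATCH = (
    "dataset_namespace = ? AND dataset_name = ?"
    " AND manifest_hash = ? AND table_name = ?"
)


def connect():
    """In-memory stand-in schema; `path` is a hypothetical column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE physical_table_revisions (
            id   INTEGER PRIMARY KEY,
            path TEXT NOT NULL UNIQUE
        );
        CREATE TABLE physical_tables (
            id                 INTEGER PRIMARY KEY,
            dataset_namespace  TEXT NOT NULL,
            dataset_name       TEXT NOT NULL,
            manifest_hash      TEXT NOT NULL,
            table_name         TEXT NOT NULL,
            active_revision_id INTEGER REFERENCES physical_table_revisions(id),
            UNIQUE (dataset_namespace, dataset_name, manifest_hash, table_name)
        );
    """)
    return conn


def register(conn, key, path):
    """Ensure a revision row for `path` and a meta row for `key`.

    `key` is (namespace, name, manifest_hash, table_name).
    Returns the revision id; an existing path is reused, not duplicated.
    """
    conn.execute(
        "INSERT OR IGNORE INTO physical_table_revisions (path) VALUES (?)",
        (path,),
    )
    rev_id = conn.execute(
        "SELECT id FROM physical_table_revisions WHERE path = ?", (path,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO physical_tables (dataset_namespace,"
        " dataset_name, manifest_hash, table_name) VALUES (?, ?, ?, ?)",
        key,
    )
    return rev_id


def mark_active(conn, key, rev_id):
    """Point this dataset table at `rev_id`; other datasets are unaffected."""
    conn.execute(
        f"UPDATE physical_tables SET active_revision_id = ? WHERE {KEY_MATCH}",
        (rev_id, *key),
    )


def mark_inactive(conn, key):
    """Clear the active pointer for this dataset table only."""
    conn.execute(
        f"UPDATE physical_tables SET active_revision_id = NULL WHERE {KEY_MATCH}",
        key,
    )


def active_path(conn, key):
    """Hypothetical read helper: active storage path for `key`, or None."""
    row = conn.execute(
        "SELECT r.path FROM physical_tables pt"
        " JOIN physical_table_revisions r ON r.id = pt.active_revision_id"
        f" WHERE {KEY_MATCH}",
        key,
    ).fetchone()
    return row[0] if row else None
```

Because activation is keyed by the full dataset tuple, calling mark_inactive for one dataset in this sketch does not change what a second dataset sharing the same revision resolves to.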

@shiyasmohd shiyasmohd self-assigned this Feb 5, 2026
@LNSD
Contributor

LNSD commented Feb 5, 2026

As this PR is changing the way we manage physical tables, I think we should add/update a docs/features/*.md document to reflect the new "link/unlink" behaviour.
