fix(glue): Support create_table for S3 Tables federated databases by jamesbornholt · Pull Request #3058 · apache/iceberg-python

jamesbornholt · 2026-02-17T07:28:17Z

Rationale for this change

GlueCatalog.create_table() fails for S3 Tables because it tries to write Iceberg metadata to a warehouse-derived location before registering the table in Glue. S3 Tables manages storage internally, so the location is not known until the table exists in the service.

Detect S3 Tables federated databases (FederatedDatabase.ConnectionType == "aws:s3tables") and use a two-phase create: first create a minimal table entry in Glue to have S3 Tables allocate storage, then write Iceberg metadata to the managed location and update the Glue table with the metadata pointer. On failure, clean up the table entry.

This follows the same "pre-create then update" pattern used by the Java GlueCatalog when LakeFormation is enabled
(GlueTableOperations.createGlueTempTableIfNecessary).

Are these changes tested?

moto doesn't mock with enough fidelity to test this directly. I've tested locally against both S3 Tables and vanilla Glue Iceberg and confirmed CreateTable still works on both paths.

Are there any user-facing changes?

No.

GlueCatalog.create_table() fails for S3 Tables because it tries to write Iceberg metadata to a warehouse-derived location before registering the table in Glue. S3 Tables manages storage internally, so the location is not known until the table exists in the service. Detect S3 Tables federated databases (FederatedDatabase.ConnectionType == "aws:s3tables") and use a two-phase create: first create a minimal table entry in Glue to have S3 Tables allocate storage, then write Iceberg metadata to the managed location and update the Glue table with the metadata pointer. On failure, clean up the table entry. This follows the same "pre-create then update" pattern used by the Java GlueCatalog when LakeFormation is enabled (GlueTableOperations.createGlueTempTableIfNecessary).

kevinjqliu · 2026-02-17T15:32:46Z

thanks for the PR! do you know of a way to test this change?

Add three moto-based tests for the S3 Tables code path in GlueCatalog: - create_table succeeds and uses the table warehouse location - create_table rejects an explicit location for S3 Tables databases - create_table raises TableAlreadyExistsError for duplicate tables Moto does not support FederatedDatabase or S3 Tables storage allocation, so the tests patch FakeDatabase.as_dict to return FederatedDatabase and FakeTable.__init__ to inject a StorageDescriptor.Location.

jamesbornholt · 2026-02-17T18:37:32Z

I took a stab at writing some tests. It requires monkeypatching moto to emulate Glue's behavior here, which is not super elegant, but it captures the right behavior.

kevinjqliu · 2026-02-17T19:23:58Z

@geruh could you take a look when you get a chance?

geruh

Thanks for raising this @jamesbornholt! I left a few comments. At a high level, Looks like we can already both load and commit against s3t tables federated to glue. This would just fill the create_table gap.

Initially I was a bit hesitant about adding support for this use case as it requires 4/5 api calls and in the non s3t federated case it adds an additional call. Although I don't think there is any way around this.

WDYT?

pyiceberg/catalog/glue.py

geruh · 2026-02-17T19:30:19Z

pyiceberg/catalog/glue.py

+
+    def _create_table_s3tables(
+        self,
+        identifier: str | Identifier,


nit: we have some redundancy with these params passed in: ident, db_name, tbl_name maybe we can derive?

pyiceberg/catalog/glue.py

geruh · 2026-02-17T22:04:54Z

pyiceberg/catalog/glue.py

+                )
+            raise
+
+        return self.load_table(identifier=identifier)


we could probably avoid the last load table here with the update table response

glue UpdateTable doesn't return anything, but we can actually just construct the table from the staged metadata the same way we do it in the commit path: https://github.com/jamesbornholt/iceberg-python/blob/89a129c61b5f6320e78d7875a615b2383d590fa1/pyiceberg/catalog/glue.py#L381-L387. i've done that for both the S3 Tables and regular creation paths now.

…reate_table_s3tables signature - Extract "aws:s3tables" string literal into GLUE_CONNECTION_S3_TABLES constant - Remove redundant database_name/table_name params from _create_table_s3tables; derive them from identifier inside the method instead

Avoid a redundant GetTable + GetObject round-trip at the end of create_table and _create_table_s3tables by constructing the Table from the staged metadata that was just written, matching the pattern used in commit_table.

jamesbornholt · 2026-02-18T18:19:24Z

Thanks for raising this @jamesbornholt! I left a few comments. At a high level, Looks like we can already both load and commit against s3t tables federated to glue. This would just fill the create_table gap.

Initially I was a bit hesitant about adding support for this use case as it requires 4/5 api calls and in the non s3t federated case it adds an additional call. Although I don't think there is any way around this.

WDYT?

I was worried about this too and it's definitely unfortunate, but I also don't see a way around it, and since it's only on the creation path I think it's OK. I took the suggestion of removing the extra LoadTable request at the end of creation, so the non-federated path nets out as the same number of roundtrips now.

jamesbornholt force-pushed the fix-glue-create-table-s3tables branch from 5633f4e to 9708909 Compare February 17, 2026 07:31

jamesbornholt force-pushed the fix-glue-create-table-s3tables branch from 0c4f375 to 4f9299e Compare February 17, 2026 18:23

geruh reviewed Feb 17, 2026

View reviewed changes

jamesbornholt added 2 commits February 18, 2026 09:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(glue): Support create_table for S3 Tables federated databases#3058

fix(glue): Support create_table for S3 Tables federated databases#3058
jamesbornholt wants to merge 4 commits intoapache:mainfrom
jamesbornholt:fix-glue-create-table-s3tables

jamesbornholt commented Feb 17, 2026

Uh oh!

kevinjqliu commented Feb 17, 2026

Uh oh!

jamesbornholt commented Feb 17, 2026

Uh oh!

kevinjqliu commented Feb 17, 2026

Uh oh!

geruh left a comment

Uh oh!

Uh oh!

geruh Feb 17, 2026

Uh oh!

jamesbornholt Feb 18, 2026

Uh oh!

Uh oh!

geruh Feb 17, 2026

Uh oh!

jamesbornholt Feb 18, 2026

Uh oh!

jamesbornholt commented Feb 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

jamesbornholt commented Feb 17, 2026

Rationale for this change

Are these changes tested?

Are there any user-facing changes?

Uh oh!

kevinjqliu commented Feb 17, 2026

Uh oh!

jamesbornholt commented Feb 17, 2026

Uh oh!

kevinjqliu commented Feb 17, 2026

Uh oh!

geruh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

geruh Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbornholt Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

geruh Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbornholt Feb 18, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbornholt commented Feb 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments