
Conversation

@beryllw (Contributor) commented on Jan 26, 2026:

Purpose

Linked issue: close #2482

Brief change log

Tests

API and Format

Documentation

@beryllw marked this pull request as draft on January 26, 2026 12:54
@beryllw force-pushed the insertIfNotExists-poc branch 2 times, most recently from 7aa9f61 to 9f5414c, on January 29, 2026 08:06
@beryllw marked this pull request as ready for review on January 29, 2026 08:07
@beryllw changed the title from "[kv] Implement Atomic Lookup-or-Put for KV Tables" to "[kv] Implement insertIfNotExists on tablet server" on Jan 29, 2026
@beryllw force-pushed the insertIfNotExists-poc branch from 5b7aff2 to b2e175a on January 29, 2026 12:47
/**
 * Creates a {@link KeyDecoder} that decodes the given key fields of the row type.
 *
 * @param keyFields the key fields to decode
 */
static KeyDecoder of(RowType rowType, List<String> keyFields) {
    return CompactedKeyDecoder.createKeyDecoder(rowType, keyFields);
}
@beryllw (Contributor, Author) commented:
Will add KeyDecoders for Paimon and Iceberg later.
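For context, a hedged usage sketch of this factory follows. It assumes KeyDecoder exposes a decodeKey(byte[]) method returning InternalRow and that "id" is the table's key field; neither is confirmed by this excerpt, and the Fluss types (KeyDecoder, RowType, InternalRow) are taken as given on the classpath.

import java.util.List;

// Sketch only: decodeKey(byte[]) -> InternalRow is an assumed signature,
// and "id" is a hypothetical key field.
class KeyDecoderUsageSketch {
    InternalRow decodeCompactedKey(RowType rowType, byte[] compactedKey) {
        // build a decoder over the single key field, then recover the key row
        KeyDecoder decoder = KeyDecoder.of(rowType, List.of("id"));
        return decoder.decodeKey(compactedKey);
    }
}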

@platinumhamburg (Contributor) left a comment:

LGTM, just a few minor comments. Overall, this is the implementation with the minimal changes for the current update. It is not yet performance-optimal: during the PutKV process, the original key is encoded and then immediately decoded again. This could be addressed in a future optimization; I suggest adding a TODO comment for it.
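A minimal sketch of the round-trip being described, using hypothetical names (keyEncoder, keyDecoder, putKv) and local stand-in interfaces rather than the PR's actual classes:

// Sketch only: the names and nested interfaces here are illustrative,
// not the actual code in this PR.
class PutKvRoundTripSketch {
    interface InternalRow {}
    interface KeyEncoder { byte[] encodeKey(InternalRow row); }
    interface KeyDecoder { InternalRow decodeKey(byte[] key); }

    private final KeyEncoder keyEncoder;
    private final KeyDecoder keyDecoder;

    PutKvRoundTripSketch(KeyEncoder keyEncoder, KeyDecoder keyDecoder) {
        this.keyEncoder = keyEncoder;
        this.keyDecoder = keyDecoder;
    }

    void putKv(InternalRow row) {
        byte[] encodedKey = keyEncoder.encodeKey(row); // key encoded from the row
        // TODO: avoid this encode-then-decode round-trip; the key row could be
        // projected from `row` directly in a future optimization.
        InternalRow keyRow = keyDecoder.decodeKey(encodedKey);
        // ... proceed with the lookup / put using encodedKey and keyRow ...
    }
}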

@platinumhamburg (Contributor) left a comment:
There is a critical bug that needs to be fixed

@platinumhamburg (Contributor) commented:

I’ve thought about it carefully, and I think KeyRecordBatch is too specific, especially since it’s meant for a temporary solution we all agree isn’t final. We could just remove it entirely and use a simpler approach instead. But if we’re short on time, it’s not a big deal (it doesn’t add much code), so we can keep it for now and clean it up later.

@beryllw force-pushed the insertIfNotExists-poc branch 2 times, most recently from a2d7e77 to 37efb9a, on January 30, 2026 14:28
@beryllw force-pushed the insertIfNotExists-poc branch from bee2bad to 20a2af3 on January 30, 2026 14:48
@wuchong (Member) left a comment:

@beryllw I left some comments. Especially, I added a concurrent test that reveals the correctness issue and should be fixed.

}

@Test
void testLookupWithInsertIfNotExistsAutoIncrement() throws Exception {
@wuchong (Member) commented:

Could you add an additional test for concurrent lookups with insert-if-not-exists and auto-increment enabled?

The test should:

  • Spawn multiple threads (e.g., 3) that concurrently perform lookupWithInsert operations on the same set of keys (e.g., 100, 200, 300).
  • Verify that all threads receive consistent lookup results (i.e., the same inserted values for each key).
  • Confirm that exactly 3 changelog entries are written to the log tablet—one per unique key.

This will validate that concurrent put-and-relookup operations behave correctly under contention, ensuring idempotency and consistency when auto-increment and conditional inserts are enabled.
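A minimal sketch of such a test follows, assuming hypothetical helpers lookupWithInsert(int) (returns the auto-incremented value for a key) and readChangelogEntries() in place of this PR's real client and tablet test utilities, and AssertJ assertions:

import static org.assertj.core.api.Assertions.assertThat;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.junit.jupiter.api.Test;

// Sketch only: lookupWithInsert and readChangelogEntries are hypothetical stand-ins.
@Test
void testConcurrentLookupWithInsertIfNotExistsAutoIncrement() throws Exception {
    List<Integer> keys = List.of(100, 200, 300);
    ExecutorService pool = Executors.newFixedThreadPool(3);
    List<Future<Map<Integer, Long>>> futures = new ArrayList<>();
    for (int t = 0; t < 3; t++) {
        futures.add(pool.submit(() -> {
            Map<Integer, Long> results = new HashMap<>();
            for (int key : keys) {
                // each thread races to insert-if-not-exists, then reads the row back
                results.put(key, lookupWithInsert(key));
            }
            return results;
        }));
    }
    // all threads must observe the same auto-incremented value for each key
    Map<Integer, Long> first = futures.get(0).get();
    for (Future<Map<Integer, Long>> future : futures) {
        assertThat(future.get()).isEqualTo(first);
    }
    // exactly one changelog entry per unique key, despite the contention
    assertThat(readChangelogEntries()).hasSize(keys.size());
    pool.shutdown();
}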

Comment on lines +49 to +52
if (lakeFormat == DataLakeFormat.PAIMON) {
// TODO: Implement key decoding support for Paimon lake format
throw new UnsupportedOperationException(
"Paimon lake format does not support key decoding");
@wuchong (Member) commented:

How much effort would it take to support this? I believe it’s essential: since most clusters will enable lakehouse integration, this will become a common and critical path.

If this requires significant work, please create a blocker issue to track it. And please add tests for this case as well.

timeoutMs,
requiredAcks,
produceEntryData,
null,
@wuchong (Member) commented on Feb 1, 2026:

This is incorrect. We should use partial updates to modify only the primary key fields. Otherwise, non-primary-key columns, including auto-increment fields, will be overwritten with null. This issue manifests when multiple threads concurrently call lookupAndInsert on the same keys.

To reproduce this, I’ve added a test: testConcurrentLookupWithInsertIfNotExistsAutoIncrement.

Additionally, we should enforce a validation in PutRequest: when performing a put operation, auto-increment fields must be excluded from the target columns. This ensures that auto-incremented values are never accidentally overwritten during updates. That means INSERT INTO t (id, auto_inc) VALUES ... or INSERT INTO t VALUES ... should fail with an explicit error message stating that the auto-increment fields can't be updated.
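A hedged sketch of that validation, with illustrative method and parameter names; the actual PutRequest-side check in Fluss may be shaped differently:

import java.util.List;

// Sketch only: names and exception choice are illustrative, not Fluss's actual API.
final class PutRequestValidationSketch {
    static void checkNoAutoIncrementTargets(
            List<String> targetColumns, List<String> autoIncrementColumns) {
        for (String column : targetColumns) {
            if (autoIncrementColumns.contains(column)) {
                // Reject writes that would overwrite a server-generated value, so
                // INSERT INTO t (id, auto_inc) VALUES ... fails fast and clearly.
                throw new IllegalArgumentException(
                        "Auto-increment field '"
                                + column
                                + "' cannot be a target column of a put operation; "
                                + "its value is generated by the server.");
            }
        }
    }
}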

}
}

if (insertIfNotExists) {
@xx789633 (Contributor) commented on Feb 2, 2026:

Could you please also add a comment here to indicate that there might be a data race and describe how you solve it? The same key might be inserted by another thread after the lookup check passes.
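One way to resolve that race, sketched here against a toy in-memory map rather than the PR's actual kv tablet code, is an atomic put-if-absent, so the losing writer observes the winner's value instead of overwriting it:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a toy stand-in for the kv tablet, not the code in this PR.
class InsertIfNotExistsSketch {
    private final Map<String, byte[]> kvStore = new ConcurrentHashMap<>();

    byte[] insertIfNotExists(String encodedKey, byte[] encodedValue) {
        // Data race note: another thread may insert the same key after this
        // caller's lookup miss. putIfAbsent resolves the race atomically: the
        // loser gets the winner's value back, so exactly one insert (and one
        // changelog entry) happens per key.
        byte[] existing = kvStore.putIfAbsent(encodedKey, encodedValue);
        return existing != null ? existing : encodedValue;
    }
}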


Development

Successfully merging this pull request may close these issues: Implement Atomic Lookup-or-Put for KV Tables (#2482).

4 participants