From c24bbb8c81cbc033dc1d8ebe037d2ae188ecd1e5 Mon Sep 17 00:00:00 2001
From: Kris Zyp
Date: Mon, 22 Dec 2025 12:04:39 -0700
Subject: [PATCH 1/2] Describe controlled route flow

---
 docs/developers/replication/index.md | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/docs/developers/replication/index.md b/docs/developers/replication/index.md
index 931dd755..458fce0e 100644
--- a/docs/developers/replication/index.md
+++ b/docs/developers/replication/index.md
@@ -204,15 +204,30 @@ replication:
 
 Note that in this example, we are using loop back addresses, which can be a convenient way to run multiple nodes on a single machine for testing and development.
 
-#### Explicit Subscriptions
+### Controlled Replication Flow
 
-#### Managing Node Connections and Subscriptions in Harper
+By default, Harper will replicate all data in all databases, with symmetric bi-directional flow between nodes. However, there are times when you may want to control the replication flow between nodes, and dictate that data should only be replicated in one direction between certain nodes. This can be done by setting the direction in the `replicates` attribute of the node definition when adding the node or configuring the replication route. For example, to configure a node to only send data to `node-two`, and only receive data from `node-three` you can add the following to the replication route:
 
-By default, Harper automatically handles connections and subscriptions between nodes, ensuring data consistency across your cluster. It even uses data routing to manage node failures. But if you want more control, you can manage these connections manually by explicitly subscribing to nodes. This is useful for advanced configurations, testing, or debugging.
+```yaml
+replication:
+  databases:
+    - data
+  routes:
+    - name: node-two
+      replicates:
+        sends: true
+        receives: false
+    - name: node-three
+      replicates:
+        sends: false
+        receives: true
+```
 
-#### Important Notes on Explicit Subscriptions
+When using controlled flow replication, you will typically have different route configurations for each node to every other node. In that case, typically you do want to ensure that you are _not_ replicating the `system` database, since the `system` database contains the node configurations, and replicating the `system` database will cause all nodes to be replicated and have identical route configurations.
+
+#### Explicit Subscriptions
 
-If you choose to manage subscriptions manually, Harper will no longer handle data consistency for you. This means there’s no guarantee that all nodes will have consistent data if subscriptions don’t fully replicate in all directions. If a node goes down, it’s possible that some data wasn’t replicated before the failure.
+By default, Harper automatically handles connections and subscriptions between nodes, ensuring data consistency across your cluster. It even uses data routing to manage node failures. However, you can manage these connections manually by explicitly subscribing to nodes. This should _not_ be used for production replication; it exists only for testing, debugging, and legacy migration. This will likely be removed in V5. If you choose to manage subscriptions manually, Harper will no longer handle data consistency for you. This means there’s no guarantee that all nodes will have consistent data if subscriptions don’t fully replicate in all directions. If a node goes down, it’s possible that some data wasn’t replicated before the failure. If you want single-direction replication, you can use the controlled replication flow described above.
 
 #### How to Subscribe to Nodes
 

From b63095107ac537ca62f5481dd3188e4a999e0f0e Mon Sep 17 00:00:00 2001
From: Kris Zyp
Date: Tue, 23 Dec 2025 18:03:04 -0700
Subject: [PATCH 2/2] Update host name in config and direction of flow

---
 docs/developers/replication/index.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/developers/replication/index.md b/docs/developers/replication/index.md
index 458fce0e..703f00f3 100644
--- a/docs/developers/replication/index.md
+++ b/docs/developers/replication/index.md
@@ -206,21 +206,21 @@ Note that in this example, we are using loop back addresses, which can be a conv
 
 ### Controlled Replication Flow
 
-By default, Harper will replicate all data in all databases, with symmetric bi-directional flow between nodes. However, there are times when you may want to control the replication flow between nodes, and dictate that data should only be replicated in one direction between certain nodes. This can be done by setting the direction in the `replicates` attribute of the node definition when adding the node or configuring the replication route. For example, to configure a node to only send data to `node-two`, and only receive data from `node-three` you can add the following to the replication route:
+By default, Harper will replicate all data in all databases, with symmetric bi-directional flow between nodes. However, there are times when you may want to control the replication flow between nodes, and dictate that data should only be replicated in one direction between certain nodes. This can be done by setting the direction in the `replicates` attribute of the node definition when adding the node or configuring the replication route. For example, to configure a node to only send data to `node-two` (which only receives), and only receive data from `node-three` (which only sends), you can add the following to the replication route:
 
 ```yaml
 replication:
   databases:
     - data
   routes:
-    - name: node-two
-      replicates:
-        sends: true
-        receives: false
-    - name: node-three
+    - host: node-two
       replicates:
         sends: false
         receives: true
+    - host: node-three
+      replicates:
+        sends: true
+        receives: false
 ```
 
 When using controlled flow replication, you will typically have different route configurations for each node to every other node. In that case, typically you do want to ensure that you are _not_ replicating the `system` database, since the `system` database contains the node configurations, and replicating the `system` database will cause all nodes to be replicated and have identical route configurations.