-
Notifications
You must be signed in to change notification settings - Fork 39
3.x: Implement LWT Replica-Only Routing with Local DC Prioritization #796
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: scylla-3.x
Are you sure you want to change the base?
3.x: Implement LWT Replica-Only Routing with Local DC Prioritization #796
Conversation
feat: Enhance LWT routing in TokenAwarePolicy with replica-only optimization and detailed documentation refactor: Improve formatting in newQueryPlan methods for better readability
…queries test: Implement LWT replica-only routing with local DC prioritization
|
|
||
| if (partitionKey == null || keyspace == null) | ||
| return childPolicy.newQueryPlan(keyspace, statement); | ||
| return childPolicy.newQueryPlan(loggedKeyspace, statement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dkropachev Please confirm this is a necessary fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not, it is a bug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Implements replica-only routing for LWT queries within TokenAwarePolicy, prioritizing local-DC replicas and allowing fallback to regular routing based on a new QueryOptions setting.
Changes:
- Removed
RequestHandler’s LWT special-casing so LWTs flow through the configured load balancing policy chain. - Added LWT-aware query planning to
TokenAwarePolicy(replicas-only, local-first, filters DOWN/IGNORED), configurable viaQueryOptions.RequestRoutingMethod. - Added unit tests covering LWT replica filtering and updated README with the new optimization.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| driver-core/src/main/java/com/datastax/driver/core/policies/TokenAwarePolicy.java | Adds LWT-specific replica-only query plan behavior and related Javadoc. |
| driver-core/src/main/java/com/datastax/driver/core/RequestHandler.java | Removes LWT replica-routing bypass so LWT uses the LBP chain. |
| driver-core/src/main/java/com/datastax/driver/core/QueryOptions.java | Adds configurable LWT routing method (RequestRoutingMethod) and default. |
| driver-core/src/test/java/com/datastax/driver/core/policies/TokenAwarePolicyTest.java | Adds unit tests for LWT local-first ordering and filtering behavior. |
| README.md | Documents the LWT optimization feature at a high level. |
| * replicas} for that partition key is not empty). If no local replica can be either found or | ||
| * successfully contacted, the rest of the query plan will fallback to the child policy's one. | ||
| * {@code null} and if the {@linkplain Metadata#getReplicasList(String, String, Token.Factory, | ||
| * ByteBuffer)} (String, ByteBuffer) set of replicas} for that partition key is not empty). If |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class Javadoc link/phrase around Metadata#getReplicasList is malformed ({@linkplain Metadata#getReplicasList(...)} (String, ByteBuffer) set of replicas}), which will render incorrectly in Javadoc and is confusing. Please rewrite this sentence/link to correctly reference the getReplicasList overload used by the policy, without the stray (String, ByteBuffer) fragment and with balanced braces.
| * ByteBuffer)} (String, ByteBuffer) set of replicas} for that partition key is not empty). If | |
| * ByteBuffer) set of replicas} for that partition key is not empty). If |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| * <li>No non-replica hosts: unlike regular queries, LWT plans do not fall back to non-replica | ||
| * hosts, ensuring all attempts target actual replicas to maintain LWT correctness and reduce | ||
| * coordinator forwarding overhead. |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LWT Javadoc says "LWT plans do not fall back to non-replica hosts" and claims this is needed "to maintain LWT correctness", but later in the same Javadoc it documents fallback to the child policy (which may include non-replicas). Also, LWT correctness does not require the coordinator to be a replica; this is a performance optimization. Please adjust this bullet to be consistent with the fallback section and avoid the incorrect correctness claim.
| * <li>No non-replica hosts: unlike regular queries, LWT plans do not fall back to non-replica | |
| * hosts, ensuring all attempts target actual replicas to maintain LWT correctness and reduce | |
| * coordinator forwarding overhead. | |
| * <li>Replica-only routing when possible: under normal conditions, LWT query plans target only | |
| * replicas for the partition in order to reduce coordinator forwarding overhead and improve | |
| * performance. When replica information is unavailable, the driver falls back to the child | |
| * policy as described in the fallback behavior below, which may include non-replica hosts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is ok to fallback to non-replica hosts, but only when replicas are exhausted
| && this.prepareOnAllHosts == other.prepareOnAllHosts | ||
| && this.loadBalancingLwtRequestRoutingMethod | ||
| == other.loadBalancingLwtRequestRoutingMethod) | ||
| && this.schemaQueriesPaged == other.schemaQueriesPaged; |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equals() now includes loadBalancingLwtRequestRoutingMethod, but hashCode() does not. This violates the equals/hashCode contract and can break behavior in hash-based collections. Please include loadBalancingLwtRequestRoutingMethod in hashCode() (and any other derived methods if present).
| assertThat(rs.getExecutionInfo().getQueriedHost()).isEqualTo(host1); | ||
| assertThat(rs.isExhausted()).isFalse(); | ||
| Row r = rs.one(); | ||
| assertThat(rs.getExecutionInfo().getQueriedHost()).isEqualTo(host1); |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion duplicates the one immediately above for the same result set execution info and doesn’t add new coverage, making the test harder to read/maintain. Consider removing it or replacing it with an assertion that validates additional behavior.
| assertThat(rs.getExecutionInfo().getQueriedHost()).isEqualTo(host1); |
|
|
||
| if (nonReplicasIterator.hasNext()) return nonReplicasIterator.next(); | ||
| } | ||
| final Iterator<Host> childIterator = childPolicy.newQueryPlan(loggedKeyspace, statement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dkropachev Same here, changed from keyspace to loggedKeyspace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bug
|
|
||
| public LWTHostIterator(Iterator<Host> replicasIterator) { | ||
| this.replicasIterator = replicasIterator; | ||
| nonLocalReplicas = new LinkedList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LinkedList is quite slow, please use ArrayList instead
| * replicas} for that partition key is not empty). If no local replica can be either found or | ||
| * successfully contacted, the rest of the query plan will fallback to the child policy's one. | ||
| * {@code null} and if the {@linkplain Metadata#getReplicasList(String, String, Token.Factory, | ||
| * ByteBuffer)} (String, ByteBuffer) set of replicas} for that partition key is not empty). If |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| * <li>No non-replica hosts: unlike regular queries, LWT plans do not fall back to non-replica | ||
| * hosts, ensuring all attempts target actual replicas to maintain LWT correctness and reduce | ||
| * coordinator forwarding overhead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is ok to fallback to non-replica hosts, but only when replicas are exhausted
|
|
||
| if (partitionKey == null || keyspace == null) | ||
| return childPolicy.newQueryPlan(keyspace, statement); | ||
| return childPolicy.newQueryPlan(loggedKeyspace, statement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not, it is a bug
|
|
||
| if (nonReplicasIterator.hasNext()) return nonReplicasIterator.next(); | ||
| } | ||
| final Iterator<Host> childIterator = childPolicy.newQueryPlan(loggedKeyspace, statement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bug
| when(childPolicy.distance(host2)).thenReturn(HostDistance.REMOTE); | ||
| when(host2.isUp()).thenReturn(true); | ||
| when(childPolicy.distance(host2)).thenReturn(HostDistance.IGNORED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is something missing ?
It looks like when(childPolicy.distance(host2)).thenReturn(HostDistance.REMOTE); is not doing anything
Implement LWT Replica-Only Routing with Local DC Prioritization
Overview
This PR implements specialized load balancing for Lightweight Transaction (LWT) queries to optimize performance by routing exclusively to replicas and prioritizing local datacenter replicas. This reduces coordinator forwarding overhead and minimizes contention during Paxos consensus phases.
Motivation
LWT queries require multi-round Paxos coordination and are sensitive to latency and coordinator variance. The previous implementation used the same routing logic as regular queries, which could:
LOCAL_SERIALtransactionsThis change ensures LWT queries always target replicas directly, with local replicas first, resulting in lower latency and reduced contention.
Changes
1. Modified
RequestHandlerLWT PathLoadBalancingPolicychain like regular queries2. Enhanced
TokenAwarePolicywith LWT-Aware RoutingLWTHostIterator: Specialized iterator for LWT queries that:childPolicy.distance(host) == LOCAL)RANDOM,TOPOLOGICAL, orNEUTRALstrategies3. Key Design Decisions
✅ Rack Awareness Intentionally Omitted
✅ Replica-Only with Pragmatic Fallback
✅ Local DC First, Remote DC Second
LOCAL) are returned first; remote replicas (distance ==REMOTE) followLOCAL_SERIAL, this keeps Paxos messaging within the local DC for best performance. Remote replicas are still included as fallback for availability.✅ Filter DOWN and IGNORED Hosts
LWTHostIteratorskips hosts that are!isUp()or havedistance == IGNOREDIGNOREDhonors user-configured allowlists and filters4. Documentation
LWTHostIteratornow documents filtering and ordering logicRequirements for LWT Replica-Only Routing
For the optimization to apply, the following must be true:
DCAwareRoundRobinPolicyor similarConfiguration
LWT routing method is controlled via:
PRESERVE_REPLICA_ORDER(replicas-only, local-first)REGULAR(uses standard token-aware routing with non-replica fallback)Performance Impact
Expected Improvements (with routing info available)
LOCAL_SERIALFallback Scenario (routing info missing)
Compatibility
Testing
Unit Tests Added
should_filter_down_replicas_for_lwt: Verifies DOWN replicas are excluded from LWT query plansshould_filter_ignored_replicas_for_lwt: Verifies IGNORED replicas (e.g., from allowlists) are excludedshould_filter_down_and_ignored_replicas_for_lwt: Verifies combined filtering of DOWN and IGNORED hostsshould_return_empty_plan_when_all_lwt_replicas_filtered: Verifies empty plan when all replicas filteredValidation Performed
mvn fmt:formatmvn compile -pl driver-core -ammvn test -Dtest=TokenAwarePolicyTest -pl driver-coremvn verify -DskipTests -pl driver-coreRecommended Integration Tests (follow-up)
LOCAL_SERIAL: verify local DC replicas used first in live clusterMigration Notes
No action required for existing users. The optimization is automatic when:
TokenAwarePolicy(default in most configurations)Users not meeting these conditions will continue to work with existing behavior.
Related Documentation
com.datastax.driver.core.policies.TokenAwarePolicyCloses
#752