Skip to content

Conversation

@wanwiset25
Copy link
Collaborator

@wanwiset25 wanwiset25 commented Jan 26, 2026

Proposed changes

  1. Revert to using masternodes list from snapshot during vote verification.
    This is due to epochInfo.Masternodes requires target block to be downloaded in database but during voting sometimes vote messages arrive before block download. Causing delay in minting blocks.

  2. Chain initialization underflow, tempTCEpoch should have lowest value of 0.

ref: #1628 #1850

Types of changes

What types of changes does your code introduce to XDC network?
Put an in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)
  • Regular KTLO or any of the maintaince work. e.g code style
  • CICD Improvement

Impacted Components

Which part of the codebase this PR will touch base on,

Put an in the boxes that apply

  • Consensus
  • Account
  • Network
  • Geth
  • Smart Contract
  • External components
  • Not sure (Please specify below)

Checklist

Put an in the boxes once you have confirmed below actions (or provide reasons on not doing so) that

  • This PR has sufficient test coverage (unit/integration test) OR I have provided reason in the PR description for not having test coverage
  • Tested in private network from genesis.
  • Provide an end-to-end test plan in the PR description on how to manually test it on the devnet/testnet.
  • Tested the backwards compatibility.
  • Tested with XDC nodes running this version co-exist with those running the previous version.
  • Relevant documentation has been updated as part of this PR
  • N/A

Copilot AI review requested due to automatic review settings January 26, 2026 06:56
@coderabbitai
Copy link

coderabbitai bot commented Jan 26, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-privchain-initialization

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes consensus v2 edge cases by switching vote signature verification back to using snapshot-derived masternode lists (to avoid requiring the proposed block header to be present) and tightening epoch/chain initialization logic to prevent underflow during TC epoch resolution.

Changes:

  • Verify vote signatures against snapshot.NextEpochCandidates instead of epochInfo.Masternodes.
  • Adjust TC epoch lookup loop to avoid going below epoch 0 (intended).
  • Improve error messages and enrich mining logs with round context; minor gaslimit handling tweak for genesis parent.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
consensus/XDPoS/utils/pool.go Adds Pool.Inspect() helper to log pool contents.
consensus/XDPoS/engines/engine_v2/vote.go Changes vote verification to use snapshot candidates; adds additional logging.
consensus/XDPoS/engines/engine_v2/verifyHeader.go Minor whitespace-only change.
consensus/XDPoS/engines/engine_v2/utils.go Improves GetBlockByEpochNumber error messages with context.
consensus/XDPoS/engines/engine_v2/timeout.go Modifies TC epoch selection loop condition to address underflow.
consensus/XDPoS/engines/engine_v2/mining.go Adds round to “your turn” info logs.
consensus/XDPoS/engines/engine_v2/engine.go Aligns gaslimit verification behavior around genesis parent handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 35 to 36
log.Warn("[VerifyVoteMessage] check", "verified", verified, "error", err, "votedBlockHash", vote.ProposedBlockInfo.Hash.Hex(), "number", vote.ProposedBlockInfo.Number, "round", vote.ProposedBlockInfo.Round)

Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log.Warn("[VerifyVoteMessage] check", ...) runs for every vote verification (including successful ones), which can create extremely noisy logs on busy networks and hide actionable warnings. Consider removing it or lowering to Debug/Trace, and/or only logging when err != nil or !verified.

Copilot uses AI. Check for mistakes.
}
log.Info("[getTCEpochInfo] Init epochInfo", "number", epochBlockInfo.Number, "round", epochRound, "tcRound", timeoutRound, "tcEpoch", tempTCEpoch)
for epochBlockInfo.Round > timeoutRound {
for epochBlockInfo.Round > timeoutRound || tempTCEpoch > 0 {
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop condition epochBlockInfo.Round > timeoutRound || tempTCEpoch > 0 does not prevent tempTCEpoch-- from underflowing when tempTCEpoch == 0 but epochBlockInfo.Round > timeoutRound is still true (e.g., if timeoutRound is very small). This can wrap tempTCEpoch to math.MaxUint64 and cause incorrect epoch lookups. Guard the decrement (break when tempTCEpoch == 0) or change the loop logic so the decrement only happens when tempTCEpoch > 0 (e.g., use && plus a final iteration that can reach epoch 0 safely).

Suggested change
for epochBlockInfo.Round > timeoutRound || tempTCEpoch > 0 {
for epochBlockInfo.Round > timeoutRound || tempTCEpoch > 0 {
if tempTCEpoch == 0 {
break
}

Copilot uses AI. Check for mistakes.
if err != nil && parent.Number.Uint64() != 0 { // skip genesis block
return err
}

Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is trailing whitespace on this blank line; running gofmt should remove it. Some CI/lint setups will fail on whitespace-only diffs.

Copilot uses AI. Check for mistakes.
Comment on lines 44 to 52
p.lock.RLock()
defer p.lock.RUnlock()

for poolKey, objMap := range p.objList {
log.Info("[Inspect] Pool Key:", "poolKey", poolKey, "numObjects", len(objMap))
for objHash, obj := range objMap {
log.Info("[Inspect] Object Hash:", "objHash", objHash.Hex(), "signer", obj.GetSigner().Hex())
}
}
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inspect() holds the pool's read lock while emitting per-object log.Info calls. Logging can be slow and will block writers for the duration of the iteration, which can impact consensus responsiveness under load. Consider copying the minimal data you need (keys/counts/signer) under the lock, then releasing the lock before logging (or lower the log level / make it conditional).

Suggested change
p.lock.RLock()
defer p.lock.RUnlock()
for poolKey, objMap := range p.objList {
log.Info("[Inspect] Pool Key:", "poolKey", poolKey, "numObjects", len(objMap))
for objHash, obj := range objMap {
log.Info("[Inspect] Object Hash:", "objHash", objHash.Hex(), "signer", obj.GetSigner().Hex())
}
}
type poolInfo struct {
poolKey string
numObjects int
}
type objInfo struct {
poolKey string
objHash string
signer string
}
p.lock.RLock()
poolInfos := make([]poolInfo, 0, len(p.objList))
var objInfos []objInfo
for poolKey, objMap := range p.objList {
poolInfos = append(poolInfos, poolInfo{
poolKey: poolKey,
numObjects: len(objMap),
})
for objHash, obj := range objMap {
objInfos = append(objInfos, objInfo{
poolKey: poolKey,
objHash: objHash.Hex(),
signer: obj.GetSigner().Hex(),
})
}
}
p.lock.RUnlock()
for _, info := range poolInfos {
log.Info("[Inspect] Pool Key:", "poolKey", info.poolKey, "numObjects", info.numObjects)
}
for _, info := range objInfos {
log.Info("[Inspect] Object Hash:", "objHash", info.objHash, "signer", info.signer)
}

Copilot uses AI. Check for mistakes.
Comment on lines 25 to 29
snapshot, err := x.getSnapshot(chain, vote.GapNumber, true)
if err != nil {
log.Error("[VerifyVoteMessage] Fail to get epochInfo when verifying vote message!", "blockNum", vote.ProposedBlockInfo.Number, "blockHash", vote.ProposedBlockInfo.Hash, "voteHash", vote.Hash(), "voteGapNumber", vote.GapNumber, "err", err)
log.Error("[VerifyVoteMessage] fail to get snapshot for a vote message", "blockNum", vote.ProposedBlockInfo.Number, "blockHash", vote.ProposedBlockInfo.Hash, "voteHash", vote.Hash(), "error", err.Error())
return false, err
}
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerifyVoteMessage now always calls getSnapshot and returns an error if the gap header/snapshot isn't available. This breaks the previously supported timing condition where votes can arrive before required chain data is present (and will also make the existing TestVerifyVoteMessage_HeaderNotPresent fail). Consider deferring verification when the gap header/snapshot isn't available yet (return (false, nil) and log at Debug/Trace) instead of logging an error and returning a hard failure.

Copilot uses AI. Check for mistakes.
@wanwiset25 wanwiset25 force-pushed the fix-privchain-initialization branch 2 times, most recently from ecfb357 to 233f6ec Compare January 26, 2026 08:10
snapshot, err := x.getSnapshot(chain, vote.GapNumber, true)
if err != nil {
log.Error("[VerifyVoteMessage] Fail to get epochInfo when verifying vote message!", "blockNum", vote.ProposedBlockInfo.Number, "blockHash", vote.ProposedBlockInfo.Hash, "voteHash", vote.Hash(), "voteGapNumber", vote.GapNumber, "err", err)
log.Error("[VerifyVoteMessage] fail to get snapshot for a vote message", "blockNum", vote.ProposedBlockInfo.Number, "blockHash", vote.ProposedBlockInfo.Hash, "voteHash", vote.Hash(), "error", err.Error())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does "error", err work in this line ?

// Ensure gas settings are bounded
if err := misc.VerifyGaslimit(parent.GasLimit, header.GasLimit); err != nil {
err = misc.VerifyGaslimit(parent.GasLimit, header.GasLimit)
if err != nil && parent.Number.Uint64() != 0 { // skip genesis block
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent.Number.Sign() != 0 has better performance than parent.Number.Uint64() != 0

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in another PR

@wanwiset25 wanwiset25 force-pushed the fix-privchain-initialization branch 2 times, most recently from 20d3ebc to 01d7c1d Compare February 9, 2026 09:17
@wanwiset25 wanwiset25 force-pushed the fix-privchain-initialization branch from 12f3998 to 92a712d Compare February 10, 2026 04:25
@gzliudan gzliudan changed the title consensus: fix private chain initialization fix(consensus): fix private chain initialization Feb 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants