Skip to content

Conversation

@chatton
Copy link
Contributor

@chatton chatton commented Jan 13, 2026

Overview

Adding additional tracing within NewServiceHandler

Add OpenTelemetry tracing for all RPC server methods:

- ExecutorService: InitChain, GetTxs, ExecuteTxs, SetFinal
- StoreService: GetBlock, GetState, GetMetadata, GetGenesisDaHeight, GetP2PStoreInfo
- P2PService: GetPeerInfo, GetNetInfo
- ConfigService: GetNamespace, GetSignerInfo

All implementations follow decorator pattern with comprehensive test coverage.
Tracer names: ev-node/{service}-service
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @chatton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability of the system by integrating OpenTelemetry for distributed tracing. It introduces tracing capabilities for the Store, P2P, and Config RPC services, allowing for detailed insights into their execution flow and performance. The tracing can be toggled via configuration, providing flexibility for different environments.

Highlights

  • OpenTelemetry Integration: Introduced OpenTelemetry for distributed tracing across key RPC services.
  • Service Tracing: Added tracing to Store, P2P, and Config RPC service handlers.
  • Conditional Tracing: Tracing is conditionally enabled based on the config.Instrumentation.IsTracingEnabled() flag.
  • Detailed Span Attributes: Spans for RPC calls capture relevant request and response attributes (e.g., block height, hash, peer count, namespace) and record errors.
  • Dependency Updates: Updated go.mod and go.sum to include OpenTelemetry and testify dependencies.
  • Comprehensive Testing: New unit tests ensure the correct functionality and attribute capturing of the tracing wrappers.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@claude
Copy link
Contributor

claude bot commented Jan 13, 2026

Claude finished @chatton's task in 1m 56s —— View job


Code Review Complete

I've performed a comprehensive code review of the tracing implementation. Overall, this is a solid addition that will significantly improve observability. However, there are several issues that need to be addressed.


Critical Issues

1. Missing per-store tracing attributes in GetP2PStoreInfo (pkg/rpc/server/tracing.go:140-158)

The GetP2PStoreInfo method only records the store count but doesn't capture individual store information. This is a significant observability gap - you won't be able to see which stores are at which heights in your traces. Gemini's review also noted this, and the test coverage confirms the issue (line 315 only checks store_count).

Recommendation: Add attributes for each store's height and label:

span.SetAttributes(
    attribute.Int("store_count", len(res.Msg.Stores)),
)
for i, store := range res.Msg.Stores {
    span.SetAttributes(
        attribute.String(fmt.Sprintf("store_%d_label", i), store.Label),
        attribute.Int64(fmt.Sprintf("store_%d_height", i), int64(store.Height)),
    )
}

Fix this →


High Priority Issues

2. Simplified GetBlock span attribute logic (pkg/rpc/server/tracing.go:32-70)

The current implementation is already clean after applying Gemini's suggestion to consolidate the attribute building logic. Good job on this pattern - building attributes before starting the span is the right approach.

However, note that the hex encoding is done inline which is good for consistency.


3. Significant code duplication across decorators (pkg/rpc/server/tracing.go:18-270)

There's substantial boilerplate repeated across all traced methods:

  • Context and span initialization
  • Error handling (RecordError + SetStatus)
  • Return patterns

Recommendation: Extract a generic helper function to reduce duplication. While this adds some complexity, it will make the code much more maintainable. Example pattern:

func traceCall[Req, Res any](
    ctx context.Context,
    tracer trace.Tracer,
    spanName string,
    attrs []attribute.KeyValue,
    call func(context.Context) (*connect.Response[Res], error),
    enrichSpan func(*connect.Response[Res], trace.Span),
) (*connect.Response[Res], error) {
    ctx, span := tracer.Start(ctx, spanName, trace.WithAttributes(attrs...))
    defer span.End()

    res, err := call(ctx)
    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
        return nil, err
    }

    if enrichSpan != nil {
        enrichSpan(res, span)
    }
    return res, nil
}

This pattern would reduce each traced method from ~20 lines to ~5-10 lines.

Fix this →


4. Missing test coverage for per-store attributes (pkg/rpc/server/tracing_test.go:290-316)

The test TestTracedStoreService_GetP2PStoreInfo_Success only verifies the store_count attribute but doesn't check the individual store attributes (which aren't currently being set, per issue #1). Once you fix issue #1, you must update this test.

Recommendation: Add assertions like:

requireAttribute(t, attrs, "store_0_label", "Header Store")
requireAttribute(t, attrs, "store_0_height", int64(100))
requireAttribute(t, attrs, "store_1_label", "Data Store")
requireAttribute(t, attrs, "store_1_height", int64(99))

Code Quality & Architecture

5. Excellent use of the decorator pattern (pkg/rpc/server/tracing.go:18-30, 160-172, 217-229)

The decorator pattern is the right choice here. The WithTracing*Server functions cleanly wrap the inner handlers without modifying them. This is idiomatic Go and makes tracing optional and composable.

6. Good tracer naming convention (pkg/rpc/server/tracing.go:28, 170, 227)

Using distinct tracer names for each service (ev-node/store-service, ev-node/p2p-service, ev-node/config-service) is excellent. This allows filtering and grouping spans by service in tracing backends.

7. Appropriate span attribute choices (throughout tracing.go)

The attributes captured are meaningful and relevant:

  • Block retrieval: height/hash, found status, tx count, block size
  • State: height, app hash, DA height
  • P2P: peer count, node ID, listen addresses
  • Config: namespaces (hex-encoded), signer address (hex-encoded)

Good use of hex encoding for binary data to make traces human-readable.

8. Missing context propagation comment

While the code correctly propagates context through the traced calls, there's no documentation explaining that this enables distributed tracing across service boundaries. Consider adding a package-level comment explaining this.


Testing

9. Comprehensive test coverage structure (pkg/rpc/server/tracing_test.go)

Good test organization:

  • Separate tests for success and error cases
  • Mock implementations using interfaces
  • Helper functions to reduce setup boilerplate (setupStoreTrace, setupP2PTrace, setupConfigTrace)
  • Generic assertion helper (requireAttribute)

10. Good use of OpenTelemetry SDK test utilities (pkg/rpc/server/tracing_test.go:113-138)

Using tracetest.SpanRecorder is the right approach for testing tracing logic without needing a real tracing backend. The setup properly registers a test tracer provider and cleans it up.

11. Error test coverage (lines 182-202, 348-365, 425-442, 469-486)

Good coverage of error scenarios. Error tests verify that:

  • Spans are created even when errors occur
  • Errors are recorded on the span
  • Status is set to codes.Error

One minor improvement: Consider verifying the error message is captured correctly.


Performance & Resource Efficiency

12. Efficient attribute allocation (throughout tracing.go)

Attributes are built incrementally and passed to WithAttributes, which is more efficient than calling SetAttributes multiple times. Good pattern.

13. Potential allocation concern in GetBlock (pkg/rpc/server/tracing.go:60-67)

The block size calculation iterates over all transactions to sum their sizes. This is fine for observability, but worth noting that it adds O(n) overhead for tracing. Consider documenting this or making it conditional on a verbosity setting if block sizes become very large.


Security

14. Sensitive data in traces (pkg/rpc/server/tracing.go:267)

The signer address is included in traces as a hex-encoded string. While this is likely public information in a blockchain context, verify that this is intentional and documented. Consider whether this could leak any sensitive information in your threat model.

15. No PII concerns detected

The other traced data (hashes, heights, peer IDs, namespace IDs) are all technical metadata without personal information. Good.


Integration

16. Clean integration in server.go (pkg/rpc/server/server.go:385-389)

The conditional wrapping based on config.Instrumentation.IsTracingEnabled() is exactly right. Tracing is opt-in and has zero overhead when disabled (no spans created, no decorators applied).

17. Dependency updates look reasonable (execution/grpc/go.mod, execution/grpc/go.sum)

The dependency changes appear to be version bumps. No security concerns noted, but ensure you've reviewed the changelogs for breaking changes.


Documentation

18. Missing package-level documentation

The tracing.go file lacks a package comment explaining:

  • The purpose of the decorators
  • How tracing is enabled/disabled
  • What attributes are captured and why
  • Examples of querying traces

Recommendation: Add a comprehensive package or file comment.

19. Function documentation is adequate

Individual functions have clear names and the decorator functions have brief comments. Could be expanded but acceptable.


Summary

Strengths:

  • Clean decorator pattern implementation
  • Comprehensive test coverage structure
  • Good attribute choices for observability
  • Proper error handling in traces
  • Zero overhead when tracing is disabled

Must Fix:

  1. Add per-store attributes to GetP2PStoreInfo (critical observability gap)
  2. Update test to verify per-store attributes
  3. Consider extracting generic helper to reduce duplication (maintainability)

Nice to Have:
4. Package-level documentation
5. Document the block size calculation overhead
6. Verify signer address in traces is intentional


Recommendations Priority

  1. P0 (Must fix before merge): Issue ABCI interface #1 (missing per-store attributes)
  2. P0 (Must fix before merge): Issue Fraud proofs gossiping #4 (update test for per-store attributes)
  3. P1 (Should fix): Issue Transaction gossiping #3 (reduce duplication with generic helper)
  4. P2 (Nice to have): Issues Define Leader Selection for Cosmos-SDK based Rollup chain #18, Use addresses in multiaddr format. #19 (documentation improvements)
  5. P3 (Consider): Issues Drop-in replacement of Tendermint Node #13, P2P bootstrapping #14 (performance and security notes)

Overall, this is quality work that meaningfully improves the observability of the RPC layer. The main concerns are the missing per-store attributes and code duplication, both of which should be addressed before merge.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 13, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedJan 15, 2026, 8:40 AM

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry tracing for the Store, P2P, and Config services. The implementation uses a decorator pattern to wrap the existing service handlers, which is a good approach. The changes are well-tested, with a new test file covering the tracing logic. I've identified a few areas for improvement: a bug in attribute key generation, an opportunity to simplify the code in one of the traced methods, a suggestion to reduce code duplication across the tracing decorators, and a gap in the test coverage. Overall, this is a great addition for observability.

)
for i, store := range res.Msg.Stores {
span.SetAttributes(
attribute.Int64("store_"+string(rune(i))+"_height", int64(store.Height)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There's a bug in how the attribute key is constructed. string(rune(i)) will convert the integer index i into its corresponding Unicode character, not its string representation. For example, for i=0, it will produce a null character (\x00), which is likely not what you intend and may cause issues with tracing backends.

You should use strconv.Itoa(i) to convert the integer to a string. This will require adding an import for the strconv package.

			attribute.Int64("store_"+strconv.Itoa(i)+"_height", int64(store.Height)),

Comment on lines 36 to 54
var height uint64
switch identifier := req.Msg.Identifier.(type) {
case *pb.GetBlockRequest_Height:
height = identifier.Height
case *pb.GetBlockRequest_Hash:
// for hash-based queries, we'll add the hash as an attribute
}

ctx, span := t.tracer.Start(ctx, "StoreService.GetBlock",
trace.WithAttributes(
attribute.Int64("height", int64(height)),
),
)
defer span.End()

// add hash attribute if hash-based query
if hashIdentifier, ok := req.Msg.Identifier.(*pb.GetBlockRequest_Hash); ok {
span.SetAttributes(attribute.String("hash", hex.EncodeToString(hashIdentifier.Hash)))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for setting span attributes based on the request identifier can be simplified. Currently, you're using a switch to get the height, and then a separate if to get the hash, which involves checking the type of req.Msg.Identifier twice. This can be done more cleanly in a single switch statement that builds the attributes before starting the span. This will make the code more readable and efficient.

	var attrs []attribute.KeyValue
	switch identifier := req.Msg.Identifier.(type) {
	case *pb.GetBlockRequest_Height:
		attrs = append(attrs, attribute.Int64("height", int64(identifier.Height)))
	case *pb.GetBlockRequest_Hash:
		if identifier.Hash != nil {
			attrs = append(attrs, attribute.String("hash", hex.EncodeToString(identifier.Hash)))
		}
	}

	ctx, span := t.tracer.Start(ctx, "StoreService.GetBlock", trace.WithAttributes(attrs...))
	defer span.End()

Comment on lines +86 to +91
res, err := t.inner.GetState(ctx, req)
if err != nil {
span.RecordError(err)
span.SetStatus(codes.Error, err.Error())
return nil, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of code for handling errors from the inner service call and updating the span is repeated in almost every traced method in this file. To improve maintainability and reduce boilerplate, consider extracting this logic into a generic helper function.

For example, you could create a function that takes the context, request, the inner function call, and returns the response and error, handling the span creation and error recording internally. Here's a conceptual example:

func traceUnary[Req, Res any](
    tracer trace.Tracer,
    ctx context.Context,
    spanName string,
    req *connect.Request[Req],
    call func(context.Context, *connect.Request[Req]) (*connect.Response[Res], error),
    // ... other params for attributes
) (*connect.Response[Res], error) {
    ctx, span := tracer.Start(ctx, spanName)
    defer span.End()

    res, err := call(ctx, req)
    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
        return nil, err
    }

    // ... set success attributes

    return res, nil
}

Applying this pattern would make the tracing decorators much more concise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a bit of duplication isn't too bad to avoid a generics function that takes a function as an arg, makes things harder to reason about, but if anyone else wants this to be implemented I can do it.

Comment on lines +314 to +315
attrs := span.Attributes()
requireAttribute(t, attrs, "store_count", 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test is missing assertions for the per-store attributes (store_0_height, store_1_height). The current test only checks for store_count. Adding assertions for individual store heights would have caught the attribute key generation bug in GetP2PStoreInfo and would make the test more robust.

	attrs := span.Attributes()
	requireAttribute(t, attrs, "store_count", 2)
	requireAttribute(t, attrs, "store_0_height", int64(100))
	requireAttribute(t, attrs, "store_1_height", int64(99))

@codecov
Copy link

codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 82.23684% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.11%. Comparing base (34034c1) to head (cb10d69).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/rpc/server/tracing.go 84.13% 18 Missing and 5 partials ⚠️
pkg/rpc/server/server.go 42.85% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2972      +/-   ##
==========================================
+ Coverage   57.77%   58.11%   +0.34%     
==========================================
  Files          98       99       +1     
  Lines        9394     9543     +149     
==========================================
+ Hits         5427     5546     +119     
- Misses       3364     3388      +24     
- Partials      603      609       +6     
Flag Coverage Δ
combined 58.11% <82.23%> (+0.34%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Base automatically changed from cian/add-tracing-part-5 to main January 13, 2026 16:14
@chatton chatton marked this pull request as ready for review January 14, 2026 15:54
da.test Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to delete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good catch, I branched off a previous PR which I removed this in but it stayed around in here apparently!

@chatton chatton requested a review from tac0turtle January 15, 2026 10:30
@tac0turtle tac0turtle merged commit 3146f0c into main Jan 15, 2026
30 checks passed
@tac0turtle tac0turtle deleted the tracing-part-6 branch January 15, 2026 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants