@drewrelmas
Contributor

Related to #1435

Fixes #1693

This is a basic implementation of the Condense behavior from the above issue that works for LogAttrs payload types.

This iteration builds an entirely new RecordBatch during execution. As mentioned in the comments, once #1035 is completed, working in place on the existing RecordBatch would be more efficient, especially with respect to persisted attributes.

@github-actions github-actions bot added the rust Pull requests that update Rust code label Dec 26, 2025
@codecov

codecov bot commented Dec 26, 2025

Codecov Report

❌ Patch coverage is 88.06180% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.45%. Comparing base (b639b30) to head (d73b199).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1695      +/-   ##
==========================================
+ Coverage   84.32%   84.45%   +0.13%     
==========================================
  Files         465      466       +1     
  Lines      134493   136264    +1771     
==========================================
+ Hits       113405   115081    +1676     
- Misses      20554    20649      +95     
  Partials      534      534              
Components Coverage Δ
otap-dataflow 85.96% <88.06%> (+0.01%) ⬆️
query_abstraction 80.61% <ø> (ø)
query_engine 90.39% <ø> (+0.25%) ⬆️
syslog_cef_receivers ∅ <ø> (∅)
otel-arrow-go 53.50% <ø> (ø)
@drewrelmas drewrelmas marked this pull request as ready for review December 26, 2025 19:25
@drewrelmas drewrelmas requested a review from a team as a code owner December 26, 2025 19:25
@drewrelmas drewrelmas changed the title from "Initial Condense processor implementation" to "feat: Initial Condense processor implementation" Dec 26, 2025
@drewrelmas drewrelmas force-pushed the drewrelmas/condense-processor branch from d2c2b9e to 78b3686 Compare December 26, 2025 21:03
}
};

let mut parent_to_attrs: HashMap<u16, Vec<(&str, String)>> = HashMap::new();
Member

With a HashMap, the same logical data could produce different strings, e.g.:

 Log A: "x=1;y=2"
 Log B: "y=2;x=1" 

Backend searches on the condensed value will then miss results. Is this intentional?

Contributor Author

I don't think it is an explicit requirement for Condense to produce a certain order. The use case I had in mind was an 'overflow' field on whatever backend is being used. If an end user wants to do some further manipulation, I would expect them to leverage individual keys directly either with a parse or a simple contains over the condensed value.

For example:

// parse specific value out
| parse condensed with * "x=" value:int ";" *
| where value == 5

// basic filter
| where condensed contains "x=5"

I don't really foresee a use case where it is required for multiple keys to be in specific order together inside the condensed value.
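
As a rough illustration of why key-level lookups do not depend on ordering, here is a minimal Rust sketch. It is not code from this PR: `condensed_get` is a hypothetical helper, and the `;` and `=` delimiters are assumed from the examples in this thread.

```rust
/// Look up `key` in a condensed string of the assumed form "k1=v1;k2=v2".
/// Hypothetical helper for illustration only; it does not handle values
/// that themselves contain ';' or '='.
fn condensed_get<'a>(condensed: &'a str, key: &str) -> Option<&'a str> {
    condensed
        .split(';')
        // Split each "k=v" pair at the first '='; skip malformed pairs.
        .filter_map(|pair| pair.split_once('='))
        // Match the whole key, so "x" does not match "xx".
        .find(|(k, _)| *k == key)
        .map(|(_, v)| v)
}
```

With this helper, `"x=1;y=2"` and `"y=2;x=1"` both yield `Some("1")` for key `x`. Matching the whole key also avoids a pitfall of a plain substring filter, where searching for `x=5` would also match `xx=5` or `x=50`.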

We could always produce a reliable order by sorting, but that incurs a performance penalty.
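
For reference, the sorting option could look roughly like the sketch below. It is an assumption-laden illustration, not the processor's actual code: `condense_sorted` is a hypothetical name, the element type mirrors the `Vec<(&str, String)>` values in the snippet under review, and the `key=value;` output format is taken from the examples above.

```rust
/// Join one parent's attributes into a condensed string with keys in
/// lexicographic order, so the same logical data always produces the
/// same string. Hypothetical sketch; costs O(n log n) per record.
fn condense_sorted(attrs: &mut [(&str, String)]) -> String {
    // Stable sort by key: entries with equal keys keep their relative order.
    attrs.sort_by(|a, b| a.0.cmp(b.0));
    attrs
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join(";")
}
```

Calling this on `[("y", "2"), ("x", "1")]` and on `[("x", "1"), ("y", "2")]` gives `"x=1;y=2"` in both cases. Whether the extra sort per log record is acceptable is the trade-off discussed above.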

Development

Successfully merging this pull request may close these issues.

Initial implementation of Condense in experimental processor

2 participants