feat: Initial Condense processor implementation #1695
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1695 +/- ##
==========================================
+ Coverage 84.32% 84.45% +0.13%
==========================================
Files 465 466 +1
Lines 134493 136264 +1771
==========================================
+ Hits 113405 115081 +1676
- Misses 20554 20649 +95
Partials 534 534
d2c2b9e to 78b3686
    }
};

let mut parent_to_attrs: HashMap<u16, Vec<(&str, String)>> = HashMap::new();
Because this uses a HashMap, the same logical data could produce different strings. E.g.,
Log A: "x=1;y=2"
Log B: "y=2;x=1"
Backend searches for condensed values will miss results. Is this intentional?
I don't think it is an explicit requirement for Condense to produce a certain order. The use case I had in mind was an 'overflow' field on whatever backend is being used. If an end user wants to do some further manipulation, I would expect them to leverage individual keys directly either with a parse or a simple contains over the condensed value.
i.e.
// parse specific value out
| parse condensed with * "x=" value:int ";" *
| where value == 5
// basic filter
| where condensed contains "x=5"
I don't really foresee a use case that requires multiple keys to appear together in a specific order inside the condensed value.
We could always produce a reliable order by sorting, but that incurs a performance penalty.
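For reference, the sorting option could look roughly like the sketch below. This is not the code in this PR: `condense_sorted` is a hypothetical helper, the map is simplified to `HashMap<&str, String>` rather than the `parent_to_attrs` structure above, and the `key=value` pairs joined by `;` follow the format from the examples in this thread.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of this PR): joins attributes as
/// "key=value" pairs separated by ';'. The pairs are sorted first so the
/// output does not depend on HashMap iteration order.
fn condense_sorted(attrs: &HashMap<&str, String>) -> String {
    let mut pairs: Vec<(&str, &str)> = attrs
        .iter()
        .map(|(k, v)| (*k, v.as_str()))
        .collect();
    // Keys are unique within a map, so sorting the tuples orders by key.
    pairs.sort_unstable();
    pairs
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join(";")
}

fn main() {
    // Same logical data, inserted in a different order.
    let mut log_a = HashMap::new();
    log_a.insert("x", "1".to_string());
    log_a.insert("y", "2".to_string());

    let mut log_b = HashMap::new();
    log_b.insert("y", "2".to_string());
    log_b.insert("x", "1".to_string());

    // Both condense to the same string once the pairs are sorted.
    assert_eq!(condense_sorted(&log_a), condense_sorted(&log_b));
    println!("{}", condense_sorted(&log_a));
}
```

The extra cost is one sort per condensed value, O(n log n) in the number of attributes being condensed. A `contains "x=1"` style filter matches with or without the sort; the sort only matters for exact matching on the whole condensed string.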
Related to #1435
Fixes #1693
This is a basic implementation of the Condense behavior from the above issue that works for LogAttrs payload types.
This iteration currently builds an entirely new RecordBatch during execution. As mentioned in comments, once #1035 is completed, working in-place on the existing RecordBatch would be more efficient, especially with respect to persisted attributes.