lore: fix date filtering being applied after FTS limit #20
Open
theihor wants to merge 1 commit into facebookexperimental:main from
The lore search with --since/--until and --limit produced incorrect results: e.g. --limit 20 returned 0 results while --limit 30 returned 30 results. The root cause was that date filtering happened in Rust after the FTS query already applied its limit, so the FTS returned top-N candidates by text relevance (regardless of date), then date filtering removed most or all of them.

Add a date_timestamp (Unix epoch) field to LoreEmailInfo and the lore table schema, populated at indexing time from the RFC 2822 date header. Use this field to push date filtering into the database via .only_if() on FTS queries and SQL WHERE clauses on non-FTS queries, so the limit is applied to already-date-filtered results.

Specific changes:
- types.rs: add date_timestamp: i64 to LoreEmailInfo
- schema.rs: add date_timestamp Int64 column to lore table schema
- indexer.rs: add parse_rfc2822_to_timestamp() helper, populate date_timestamp when parsing emails
- connection.rs: rewrite search_lore_emails() to combine FTS with .only_if(date_timestamp filter) so limit applies post-date-filter; fix adaptive expansion loop early-exit when first iteration yields zero results; rewrite search_lore_emails_multi_field() to use new fetch_lore_emails_by_message_ids_with_filter() with SQL date filter; rewrite search_lore_emails_by_subject() to include date_timestamp in WHERE clause; update all lore read sites for the new column
- search.rs: fix column index offsets in update_lore_vectors() and fetch_emails_by_ids() to account for the new date_timestamp column at index 3

Assisted-by: Claude Code (claude-opus-4-6)
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
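To illustrate the ordering problem described above, here is a minimal standalone sketch. It does not use the project's code or database: the `Email` struct, `sample_emails()`, and both search functions are hypothetical stand-ins, and an in-memory sort by a made-up `relevance` score stands in for the FTS ranking. It only shows why applying the limit before the date filter can return zero rows, while filtering first fills the limit from matching rows.

```rust
/// Hypothetical stand-in for an indexed email (not the real LoreEmailInfo).
#[derive(Debug, Clone, PartialEq)]
struct Email {
    id: u32,
    relevance: u32,      // higher means a better text match (stand-in for FTS score)
    date_timestamp: i64, // Unix epoch seconds
}

/// Old behaviour: take the top `limit` rows by relevance, then filter by date.
fn limit_then_filter(emails: &[Email], since: i64, until: i64, limit: usize) -> Vec<Email> {
    let mut ranked = emails.to_vec();
    ranked.sort_by(|a, b| b.relevance.cmp(&a.relevance));
    ranked
        .into_iter()
        .take(limit)
        .filter(|e| e.date_timestamp >= since && e.date_timestamp <= until)
        .collect()
}

/// Fixed behaviour: filter by date first, then rank and apply the limit.
fn filter_then_limit(emails: &[Email], since: i64, until: i64, limit: usize) -> Vec<Email> {
    let mut ranked: Vec<Email> = emails
        .iter()
        .filter(|e| e.date_timestamp >= since && e.date_timestamp <= until)
        .cloned()
        .collect();
    ranked.sort_by(|a, b| b.relevance.cmp(&a.relevance));
    ranked.truncate(limit);
    ranked
}

/// Made-up data: 20 highly relevant old emails and 10 less relevant recent ones.
fn sample_emails() -> Vec<Email> {
    let mut emails = Vec::new();
    for i in 0..20u32 {
        emails.push(Email { id: i, relevance: 100 - i, date_timestamp: 1_000_000 + i as i64 });
    }
    for i in 20..30u32 {
        emails.push(Email { id: i, relevance: 50 - (i - 20), date_timestamp: 2_000_000 + i as i64 });
    }
    emails
}

fn main() {
    let emails = sample_emails();
    let (since, until) = (2_000_000, 3_000_000);

    let buggy = limit_then_filter(&emails, since, until, 20);
    let fixed = filter_then_limit(&emails, since, until, 20);

    println!("limit-then-filter: {} results", buggy.len());
    println!("filter-then-limit: {} results", fixed.len());
    println!("ids: {:?}", fixed.iter().map(|e| e.id).collect::<Vec<_>>());
}
```

With this sample data the top 20 rows by relevance are all outside the date window, so the first function returns nothing, while the second returns the 10 in-range rows. Pushing the date predicate into the query, as this PR does, corresponds to the second ordering.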
Author
@masoncl the change was vibe-coded (I don't really know Rust, but I have some idea how one should query a database, lol). I verified locally that the bug is fixed: the filter+limit queries now work as expected, and they got pretty snappy.